1 Atoms


1.1 What does an atom look like? 

1.1.1 Like this? 
Figure 1

Figure 3



Figure 4 


1.1.2 Or like this? 
Figure 5: $\rho_{2p0}$

Figure 6: $\rho_{3p0}$

Figure 7: $\rho_{3d0}$

Figure 8: $\rho_{4p0}$

Figure 10: $\rho_{4f0}$

Figure 12: $\rho_{5f0}$


None of these images depicts an atom as it is, because it is impossible even to visualize
an atom as it is. The best you can do with the images in the first row is to erase them from
your memory: they represent a way of viewing the atom that is too simplified for the way we
want to start thinking about it. The eight fuzzy images in the next row, by contrast, deserve
scrutiny. Each represents an aspect of a stationary state of atomic hydrogen.
You see neither the nucleus (a proton) nor the electron. What you see is a fuzzy position.
To be precise, what you see is a cloud-like blur, which is symmetrical about the vertical axis,
and which represents the atom's internal relative position: the position of the electron
relative to the proton or the position of the proton relative to the electron.

• What is the state of an atom? 

• What is a stationary state? 

• What exactly is a fuzzy position? 

• How does such a blur represent the atom's internal relative position? 

• Why can we not describe the atom's internal relative position as it is?


1.2 Quantum states 

In quantum mechanics, states 1 are probability algorithms. We use them to calculate the 
probabilities of the possible outcomes of measurements 2 on the basis of actual measurement 
outcomes. A quantum state takes as its input 

• one or several measurement outcomes, 

• a measurement M, 

• the time of M, 

and it yields as its output the probabilities of the possible outcomes of M. 

A quantum state is called stationary if the probabilities it assigns are independent of the 
time of the measurement to the possible outcomes of which they are assigned. 

From the mathematical point of view, each blur represents a density function 3 $\rho(\mathbf{r})$.
Imagine a small region R like the little box inside the first blur. And suppose that this is a
region of the (mathematical) space of positions relative to the proton. If you integrate
$\rho(\mathbf{r})$ over R, you obtain the probability p(R) of finding the electron in R, provided
that the appropriate measurement is made:

$$p(R) = \int_R \rho(\mathbf{r})\,d^3r.$$

"Appropriate" here means capable of ascertaining the truth value of the proposition "the 
electron is in R", the possible truth values being "true" or "false". What we see in each of 
the following images is a surface of constant probability density. 
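The integral over a region R can be sketched numerically. The example below is only an illustration under stated assumptions: it uses the textbook hydrogen 2p0 density with lengths measured in Bohr radii, a plain midpoint rule, and hypothetical function names (`rho_2p0`, `prob_in_box`) that do not come from the text.

```python
import math

def rho_2p0(x, y, z):
    """Hydrogen 2p0 probability density, lengths in Bohr radii (assumed form)."""
    r = math.sqrt(x * x + y * y + z * z)
    return z * z * math.exp(-r) / (32.0 * math.pi)

def prob_in_box(rho, lo, hi, n=60):
    """Midpoint-rule estimate of the integral of rho over the box lo..hi.

    lo and hi are (x, y, z) corners; n is the number of cells per axis.
    """
    steps = [(h - l) / n for l, h in zip(lo, hi)]
    cell = steps[0] * steps[1] * steps[2]
    total = 0.0
    for i in range(n):
        x = lo[0] + (i + 0.5) * steps[0]
        for j in range(n):
            y = lo[1] + (j + 0.5) * steps[1]
            for k in range(n):
                z = lo[2] + (k + 0.5) * steps[2]
                total += rho(x, y, z)
    return total * cell

# A box large enough to hold almost all of the cloud: p(R) is close to 1.
p_all = prob_in_box(rho_2p0, (-15, -15, -15), (15, 15, 15))
# The upper half of that box: by symmetry p(R) is close to 1/2.
p_upper = prob_in_box(rho_2p0, (-15, -15, 0), (15, 15, 15))
# A small box on the vertical axis, comparable to the little box in the first blur.
p_small = prob_in_box(rho_2p0, (-0.5, -0.5, 1.5), (0.5, 0.5, 2.5), n=20)
```

Each returned number is the probability of a "yes" outcome of the test "the electron is in R" for that particular box, not a statement about where the electron is in the absence of a measurement.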


1 http://en.wikipedia.org/wiki/Quantum%20state

2 http://en.wikipedia.org/wiki/Measurement%20in%20quantum%20mechanics

3 http://en.wikipedia.org/wiki/Probability%20density%20function
Figure 13

Figure 14

Figure 15: $\rho_{3d0}$

Figure 16: $\rho_{4p0}$

Figure 17: $\rho_{4d0}$

Figure 18: $\rho_{4f0}$

Figure 19: $\rho_{5d0}$

Figure 20: $\rho_{5f0}$


Now imagine that the appropriate measurement is made. Before the measurement, the 
electron is neither inside R nor outside R. If it were inside, the probability of finding it 
outside would be zero, and if it were outside, the probability of finding it inside would be 
zero. After the measurement, on the other hand, the electron is either inside or outside R. 

Conclusions: 

• Before the measurement, the proposition "the electron is in R" is neither true nor false; it 
lacks a (definite) truth value 4 . 

• A measurement generally changes the state of the system on which it is performed. 

As mentioned before, probabilities are assigned not only to measurement outcomes but
also on the basis of measurement outcomes. Each density function $\rho_{nlm}$ serves to assign
probabilities to the possible outcomes of a measurement of the position of the electron
relative to the proton. And in each case the assignment is based on the outcomes of a
simultaneous measurement of three observables: the atom's energy (specified by the value of
the principal quantum number n), its total angular momentum 5 l (specified by a letter, here
p, d, or f), and the vertical component of its angular momentum m.


1.3 Fuzzy observables 

We say that an observable Q with a finite or countable number of possible values $q_k$ is fuzzy
(or that it has a fuzzy value) if and only if at least one of the propositions "The value of Q is
$q_k$" lacks a truth value. This is equivalent to the following necessary and sufficient condition:
the probability assigned to at least one of the values $q_k$ is neither 0 nor 1.

4 http://en.wikipedia.org/wiki/Truth%20value

5 http://en.wikipedia.org/wiki/Angular_momentum%23Angular_momentum_in_quantum_mechanics

What about observables that are generally described as continuous, like a position? 

The description of an observable as "continuous" is potentially misleading. For one thing, we 
cannot separate an observable and its possible values from a measurement and its possible 
outcomes, and a measurement with an uncountable set of possible outcomes is not even in 
principle possible. For another, there is not a single observable called "position". Different 
partitions of space define different position measurements with different sets of possible 
outcomes. 

• Corollary: The possible outcomes of a position measurement (or the possible values 
of a position observable) are defined by a partition of space. They make up a finite 
or countable set of regions of space. An exact position is therefore neither a possible 
measurement outcome nor a possible value of a position observable. 

So how do those cloud-like blurs represent the electron's fuzzy position relative to the proton? 
Strictly speaking, they graphically represent probability densities in the mathematical space 
of exact relative positions, rather than fuzzy positions. It is these probability densities that 
represent fuzzy positions by allowing us to calculate the probability of every possible value 
of every position observable. 
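To make the role of partitions concrete, here is a small one-dimensional sketch. It assumes a Gaussian probability density on a line (an illustrative stand-in for the hydrogen densities), and the function names are hypothetical. Two different partitions of the line yield two different position observables, each with its own finite set of possible values and probabilities.

```python
import math

def gaussian_cdf(x, mean=0.0, sigma=1.0):
    """Cumulative distribution of a Gaussian density (illustrative choice)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))

def partition_probabilities(edges, cdf):
    """Probabilities of the regions defined by a partition of the line.

    edges is an increasing list of cut points; the regions are
    (-inf, e0], (e0, e1], ..., (e_last, +inf).
    """
    bounds = [-math.inf] + list(edges) + [math.inf]
    cum = [0.0 if b == -math.inf else 1.0 if b == math.inf else cdf(b)
           for b in bounds]
    return [hi - lo for lo, hi in zip(cum, cum[1:])]

def is_fuzzy(probabilities, tol=1e-12):
    """True if at least one value has a probability that is neither 0 nor 1."""
    return any(tol < p < 1.0 - tol for p in probabilities)

# Two partitions, hence two position observables with different value sets.
coarse = partition_probabilities([0.0], gaussian_cdf)            # two regions
fine = partition_probabilities([-1.0, 0.0, 1.0], gaussian_cdf)   # four regions
```

Both observables come out fuzzy for this density, whereas a probability assignment such as `[0.0, 1.0, 0.0]` would describe a sharp value.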

It should now be clear why we cannot describe the atom's internal relative position as it 
is. To describe a fuzzy observable is to assign probabilities to the possible outcomes of a 
measurement. But a description that rests on the assumption that a measurement is made
does not describe an observable as it is (by itself, regardless of measurements).
2 Serious illnesses require drastic remedies


2.1 Planck 

Quantum mechanics began as a desperate measure to get around some spectacular failures 
of what subsequently came to be known as classical physics 1 . 

In 1900 Max Planck 2 discovered a law that perfectly describes the spectrum of a glowing 
hot object. Planck's radiation formula 3 turned out to be irreconcilable with the physics of 
his time. (If classical physics were right, you would be blinded by ultraviolet light if you 
looked at the burner of a stove, aka the UV catastrophe 4 .) At first, it was just a fit to the 
data, "a fortuitous guess at an interpolation formula" as Planck himself called it. Only weeks 
later did it turn out to imply the quantization of energy for the emission of electromagnetic 
radiation 5 : the energy E of a quantum 6 of radiation is proportional to the frequency $\nu$ of
the radiation, the constant of proportionality being Planck's constant 7 h:

$$E = h\nu$$

We can of course use the angular frequency 8 $\omega = 2\pi\nu$ instead of $\nu$. Introducing the
reduced Planck constant $\hbar = h/2\pi$, we then have

$$E = \hbar\omega$$
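As a quick numerical illustration of the two equivalent forms, the sketch below evaluates the energy of one quantum of green light. The frequency value is an illustrative assumption, and the constants are the SI values of h and of the electronvolt supplied here for convenience, not figures taken from the text.

```python
import math

H = 6.62607015e-34          # Planck's constant in J s (SI defining value)
HBAR = H / (2.0 * math.pi)  # reduced Planck constant in J s
EV = 1.602176634e-19        # joules per electronvolt (SI defining value)

def energy_from_frequency(nu):
    """E = h * nu for a quantum of radiation of frequency nu (Hz)."""
    return H * nu

def energy_from_angular_frequency(omega):
    """E = hbar * omega for angular frequency omega (rad/s)."""
    return HBAR * omega

nu_green = 5.5e14  # Hz, roughly green light (illustrative value)
e_from_nu = energy_from_frequency(nu_green)
e_from_omega = energy_from_angular_frequency(2.0 * math.pi * nu_green)
energy_in_ev = e_from_nu / EV  # a little over 2 eV
```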


2.2 Rutherford 

In 1911 Ernest Rutherford 9 proposed a model of the atom 10 based on experiments by Geiger 
and Marsden 11 . Geiger and Marsden had directed a beam of alpha particles 12 at a thin 


1 http://en.wikipedia.org/wiki/Classical%20physics

2 http://en.wikipedia.org/wiki/Max%20Planck

3 http://en.wikipedia.org/wiki/Planck%27s%20law%20of%20black%20body%20radiation

4 http://en.wikipedia.org/wiki/UV%20catastrophe

5 http://en.wikipedia.org/wiki/Electromagnetic%20radiation

6 http://en.wikipedia.org/wiki/Quantum

7 http://en.wikipedia.org/wiki/Planck%27s%20constant

8 http://en.wikipedia.org/wiki/Angular%20frequency

9 http://en.wikipedia.org/wiki/Ernest%20Rutherford

10 http://en.wikipedia.org/wiki/Rutherford%20model

11 http://en.wikipedia.org/wiki/Geiger-Marsden%20experiment

12 http://en.wikipedia.org/wiki/Alpha%20particle


gold foil. Most of the particles passed the foil more or less as expected, but about one in 
8000 bounced back as if it had encountered a much heavier object. In Rutherford's own 
words this was as incredible as if you fired a 15 inch cannon ball at a piece of tissue paper 
and it came back and hit you. After analysing the data collected by Geiger and Marsden, 
Rutherford concluded that the diameter of the atomic nucleus (which contains over 99.9% of 
the atom's mass) was less than 0.01% of the diameter of the entire atom, and he suggested 
that atomic electrons orbit the nucleus much like planets orbit a star. 

The problem of having electrons orbit the nucleus the same way that a planet orbits a 
star is that classical electromagnetic theory demands that an orbiting electron will radiate 
away its energy and spiral into the nucleus in about $0.5 \times 10^{-10}$ seconds. This was the worst
quantitative failure in the history of physics, under-predicting the lifetime of hydrogen by 
at least forty orders of magnitude! (This figure is based on the experimentally established 
lower bound on the proton's lifetime.) 


2.3 Bohr 

In 1913 Niels Bohr 13 postulated that the angular momentum L of an orbiting atomic electron
was quantized: its "allowed" values are integral multiples of $\hbar$:

$$L = n\hbar \quad\text{where}\quad n = 1, 2, 3, \ldots$$

Why quantize angular momentum, rather than any other quantity? 

• Radiation energy of a given frequency is quantized in multiples of Planck's constant. 

• Planck's constant is measured in the same units as angular momentum 14 . 

Bohr's postulate explained not only the stability of atoms but also why the emission and 
absorption of electromagnetic radiation by atoms is discrete. In addition it enabled him to 
calculate with remarkable accuracy the spectrum of atomic hydrogen — the frequencies at 
which it is able to emit and absorb light (visible as well as infrared and ultraviolet). The 
following image shows the visible emission spectrum of atomic hydrogen, which contains 
four lines of the Balmer series 15 .

Figure 21: Visible emission spectrum of atomic hydrogen, containing four lines of the
Balmer series.

13 http://en.wikipedia.org/wiki/Niels%20Bohr

14 http://en.wikipedia.org/wiki/Angular%20momentum

15 http://en.wikipedia.org/wiki/Balmer%20series




Apart from his quantization postulate, Bohr's reasoning at this point remained completely
classical. Let's assume with Bohr that the electron's orbit is a circle of radius r. The
speed of the electron is then given by $v = r\,d\beta/dt$, and the magnitude of its acceleration by
$a = dv/dt = v\,d\beta/dt$. Eliminating $d\beta/dt$ yields $a = v^2/r$. In the cgs system of units 16 , the
magnitude of the Coulomb force 17 is simply $F = e^2/r^2$, where e is the magnitude of the
charge of both the electron and the proton. Via Newton's 18 $F = ma$ the last two equations
yield $m_e v^2 = e^2/r$, where $m_e$ is the electron's mass. If we take the proton to be at rest, we
obtain $T = m_e v^2/2 = e^2/2r$ for the electron's kinetic energy.

If the electron's potential energy at infinity is set to 0, then its potential energy V at a
distance r from the proton is minus the work 19 required to move it from r to infinity,

$$V(r) = -\int_r^\infty F(r')\,dr' = -\int_r^\infty \frac{e^2}{r'^2}\,dr' = -\frac{e^2}{r}.$$

The total energy of the electron thus is

$$E = T + V = e^2/2r - e^2/r = -e^2/2r.$$

16 http://en.wikipedia.org/wiki/Centimeter%20gram%20second%20system%20of%20units

17 http://en.wikipedia.org/wiki/Coulomb%27s%20law

18 http://en.wikipedia.org/wiki/Newton%27s%20laws%20of%20motion

19 http://en.wikipedia.org/wiki/Mechanical%20work

We want to express this in terms of the electron's angular momentum $L = m_e v r$. Remembering
that $m_e v^2 = e^2/r$, and hence $r m_e^2 v^2 = m_e e^2$, and multiplying the numerator $e^2$ by
$m_e e^2$ and the denominator $2r$ by $r m_e^2 v^2$, we obtain

$$E = -\frac{e^2}{2r} = -\frac{m_e e^4}{2 m_e^2 v^2 r^2} = -\frac{m_e e^4}{2L^2}.$$

Now comes Bohr's break with classical physics: he simply replaced L by $n\hbar$. The "allowed"
values for the angular momentum define a series of allowed values for the atom's energy:

$$E_n = -\frac{1}{n^2}\,\frac{m_e e^4}{2\hbar^2}, \qquad n = 1, 2, 3, \ldots$$


As a result, the atom can emit or absorb energy only by amounts equal to the absolute
values of the differences

$$\Delta E_{nm} = E_n - E_m = \left(\frac{1}{m^2} - \frac{1}{n^2}\right)\mathrm{Ry},$$

one Rydberg 20 (Ry) being equal to $m_e e^4/2\hbar^2 = 13.6056923(12)$ eV. This is also the ionization
energy 21 $|\Delta E_{1\infty}|$ of atomic hydrogen, the energy needed to completely remove the electron
from the proton. Bohr's predicted value was found to be in excellent agreement with the
measured value.
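The energy differences can be turned into wavelengths with a few lines of code. This is a sketch under stated assumptions: it takes the Rydberg value quoted in the text, uses $E_n = -\mathrm{Ry}/n^2$, and converts energies to wavelengths via $\lambda = hc/|\Delta E|$ with an assumed conversion constant hc in eV nm. The function names are hypothetical.

```python
RYDBERG_EV = 13.6056923   # eV, the value quoted in the text
HC_EV_NM = 1239.841984    # h*c in eV nm (conversion constant assumed here)

def energy_level(n):
    """Bohr energy E_n = -Ry / n^2 in eV."""
    return -RYDBERG_EV / n**2

def transition_energy(n, m):
    """Difference E_n - E_m in eV; its absolute value is what is emitted or absorbed."""
    return energy_level(n) - energy_level(m)

def wavelength_nm(n, m):
    """Wavelength of the radiation associated with a transition between levels n and m."""
    return HC_EV_NM / abs(transition_energy(n, m))

# The four visible Balmer lines correspond to transitions from n = 3, 4, 5, 6 to m = 2.
balmer = {n: wavelength_nm(n, 2) for n in range(3, 7)}

# Ionization from the ground state: the limit m -> infinity, where E_m -> 0.
ionization_ev = abs(transition_energy(1, float("inf")))
```

The four wavelengths land in the visible range, between roughly 410 nm and 656 nm, and the ionization energy reproduces one Rydberg.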

Using two of the above expressions for the atom's energy and solving for r, we obtain
$r = n^2\hbar^2/m_e e^2$. For the ground state (n = 1) this is the Bohr radius of the hydrogen atom 22 ,
which equals $\hbar^2/m_e e^2 = 5.291772108(18) \times 10^{-11}$ m. The mature theory yields the same
figure but interprets it as the most likely distance from the proton at which the electron
would be found if its distance from the proton were measured.
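The formulas for the orbit radius and the energy can be checked against each other numerically. The sketch below works in cgs (Gaussian) units, as the derivation does; the numerical constants are standard reference values supplied here as assumptions, and the function names are hypothetical.

```python
import math

HBAR_CGS = 1.054571817e-27   # erg s (assumed reference value)
M_E_CGS = 9.1093837015e-28   # electron mass in g (assumed reference value)
E_CGS = 4.80320471e-10       # elementary charge in esu (assumed reference value)
ERG_TO_EV = 1.0e-7 / 1.602176634e-19

def orbit_radius_cm(n):
    """r = n^2 hbar^2 / (m_e e^2) in centimetres."""
    return n**2 * HBAR_CGS**2 / (M_E_CGS * E_CGS**2)

def energy_erg(n):
    """E_n = -m_e e^4 / (2 n^2 hbar^2) in erg."""
    return -M_E_CGS * E_CGS**4 / (2.0 * n**2 * HBAR_CGS**2)

bohr_radius_cm = orbit_radius_cm(1)          # about 5.29e-9 cm
rydberg_ev = -energy_erg(1) * ERG_TO_EV      # about 13.6 eV

# Consistency with the classical relation E = -e^2 / 2r for, say, n = 2.
classical_e2 = -E_CGS**2 / (2.0 * orbit_radius_cm(2))
```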


2.4 de Broglie 

In 1923, ten years after Bohr had derived the spectrum of atomic hydrogen by postulating 
the quantization of angular momentum, Louis de Broglie 23 hit on an explanation of why the 
atom's angular momentum comes in multiples of $\hbar$. Since 1905, Einstein 24 had argued that
electromagnetic radiation itself was quantized (and not merely its emission and absorption, 


20 http://en.wikipedia.org/wiki/Rydberg

21 http://en.wikipedia.org/wiki/Ionization%20potential

22 http://en.wikipedia.org/wiki/Bohr%20radius

23 http://en.wikipedia.org/wiki/Louis%20de%20Broglie

24 http://en.wikipedia.org/wiki/Albert%20Einstein


as Planck held). If electromagnetic waves can behave like particles (now known as photons 25 ),
de Broglie reasoned, why cannot electrons behave like waves?

Suppose that the electron in a hydrogen atom is a standing wave 26 on what has so far been
thought of as the electron's circular orbit. (The crests, troughs 27 , and nodes 28 of a standing
wave are stationary.) For such a wave to exist on a circle, the circumference of the latter
must be an integral multiple of the wavelength 29 $\lambda$ of the former: $2\pi r = n\lambda$.



Figure 23 


25 http://en.wikipedia.org/wiki/Photon

26 http://en.wikipedia.org/wiki/Standing%20wave

27 http://en.wikipedia.org/wiki/Crest%20%28physics%29

28 http://en.wikipedia.org/wiki/Node%20%28physics%29

29 http://en.wikipedia.org/wiki/Wavelength



Figure 26 


Einstein had established not only that electromagnetic radiation of frequency $\nu$ comes in
quanta of energy $E = h\nu$ but also that these quanta carry a momentum $p = h/\lambda$. Using this
formula to eliminate $\lambda$ from the condition $2\pi r = n\lambda$, one obtains $pr = n\hbar$. But $pr = mvr$ is
just the angular momentum L of a classical electron with an orbit of radius r. In this way
de Broglie derived the condition $L = n\hbar$ that Bohr had simply postulated.
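The standing-wave condition can be checked numerically against Bohr's orbits. The sketch below assumes the Bohr orbit radii $r = n^2\hbar^2/m_e e^2$ and the force balance $m_e v^2 = e^2/r$ from the Bohr section, works in cgs units with reference constants supplied here as assumptions, and counts how many de Broglie wavelengths fit around the n-th orbit.

```python
import math

HBAR_CGS = 1.054571817e-27   # erg s (assumed reference value)
M_E_CGS = 9.1093837015e-28   # electron mass in g (assumed reference value)
E_CGS = 4.80320471e-10       # elementary charge in esu (assumed reference value)
H_CGS = 2.0 * math.pi * HBAR_CGS

def orbit_radius(n):
    """Bohr orbit radius r = n^2 hbar^2 / (m_e e^2)."""
    return n**2 * HBAR_CGS**2 / (M_E_CGS * E_CGS**2)

def orbit_speed(n):
    """Speed on the n-th orbit from the force balance m_e v^2 = e^2 / r."""
    return math.sqrt(E_CGS**2 / (M_E_CGS * orbit_radius(n)))

def wavelengths_on_orbit(n):
    """Circumference of the n-th orbit divided by the wavelength h / p."""
    momentum = M_E_CGS * orbit_speed(n)
    wavelength = H_CGS / momentum
    return 2.0 * math.pi * orbit_radius(n) / wavelength

counts = [wavelengths_on_orbit(n) for n in range(1, 6)]
```

Up to rounding, the n-th orbit holds exactly n wavelengths, which is the condition $2\pi r = n\lambda$ read in reverse.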


2.5 Schrödinger

If the electron is a standing wave, why should it be confined to a circle? After de Broglie's
crucial insight that particles are waves of some sort, it took less than three years for the
mature quantum theory to be found, not once but twice, by Werner Heisenberg 30 in 1925
and by Erwin Schrödinger 31 in 1926. If we let the electron be a standing wave in three
dimensions, we have all it takes to arrive at the Schrödinger equation, which is at the heart
of the mature theory.

Let's keep to one spatial dimension. The simplest mathematical description of a wave of
angular wavenumber 32 $k = 2\pi/\lambda$ and angular frequency 33 $\omega = 2\pi/T = 2\pi\nu$ (at any rate, if
you are familiar with complex numbers 34 ) is the function

$$\psi(x,t) = e^{i(kx - \omega t)}.$$

Let's express the phase 35 $\phi(x,t) = kx - \omega t$ in terms of the electron's energy $E = h\nu = \hbar\omega$
and momentum $p = h/\lambda = \hbar k$:

$$\psi(x,t) = e^{i(px - Et)/\hbar}.$$

The partial derivatives 36 with respect to x and t are

$$\frac{\partial\psi}{\partial x} = \frac{i}{\hbar}\,p\,\psi \quad\text{and}\quad \frac{\partial\psi}{\partial t} = -\frac{i}{\hbar}\,E\,\psi.$$

We also need the second partial derivative of $\psi$ with respect to x:

$$\frac{\partial^2\psi}{\partial x^2} = \left(\frac{i}{\hbar}\,p\right)^2\psi = -\frac{p^2}{\hbar^2}\,\psi.$$

We thus have

$$E\psi = i\hbar\frac{\partial\psi}{\partial t}, \quad p\psi = -i\hbar\frac{\partial\psi}{\partial x}, \quad\text{and}\quad p^2\psi = -\hbar^2\frac{\partial^2\psi}{\partial x^2}.$$

30 http://en.wikipedia.org/wiki/Werner%20Heisenberg

31 http://en.wikipedia.org/wiki/Erwin%20Schr%F6dinger

32 http://en.wikipedia.org/wiki/wavenumber

33 http://en.wikipedia.org/wiki/Angular%20frequency

34 http://en.wikipedia.org/wiki/Complex%20number

35 http://en.wikipedia.org/wiki/Phase%20%28waves%29
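These three operator relations can be checked numerically with finite differences. The sketch below is an illustration only: it assumes units in which $\hbar = 1$, arbitrary sample values for p, E, x and t, and hypothetical helper names.

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (assumption made for illustration)

def plane_wave(x, t, p, energy):
    """psi(x, t) = exp(i (p x - E t) / hbar)."""
    return cmath.exp(1j * (p * x - energy * t) / HBAR)

def d_dx(f, x, t, h=1e-5):
    """Central-difference estimate of the partial derivative with respect to x."""
    return (f(x + h, t) - f(x - h, t)) / (2.0 * h)

def d_dt(f, x, t, h=1e-5):
    """Central-difference estimate of the partial derivative with respect to t."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)

def d2_dx2(f, x, t, h=1e-4):
    """Central-difference estimate of the second partial derivative with respect to x."""
    return (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / (h * h)

p, energy = 1.3, 0.7   # arbitrary sample values
x0, t0 = 0.4, 0.9      # arbitrary sample point

def wave(x, t):
    return plane_wave(x, t, p, energy)

# Residuals of the three relations; each should vanish up to discretization error.
res_energy = energy * wave(x0, t0) - 1j * HBAR * d_dt(wave, x0, t0)
res_momentum = p * wave(x0, t0) + 1j * HBAR * d_dx(wave, x0, t0)
res_p_squared = p**2 * wave(x0, t0) + HBAR**2 * d2_dx2(wave, x0, t0)
```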

In non-relativistic classical physics 37 the kinetic energy 38 E and the kinetic momentum p of a
free particle 39 are related via the dispersion relation 40

$$E = p^2/2m.$$

This relation also holds in non-relativistic quantum physics. Later you will learn why.

In three spatial dimensions, p is the magnitude of a vector $\mathbf{p}$. If the particle also has a
potential energy $V(\mathbf{r},t)$ and a potential momentum $\mathbf{A}(\mathbf{r},t)$ (in which case it is not free), and
if E and $\mathbf{p}$ stand for the particle's total energy and total momentum, respectively, then the
dispersion relation is

$$E - V = (\mathbf{p} - \mathbf{A})^2/2m.$$

By the square of a vector $\mathbf{v}$ we mean the dot (or scalar) product 41 $\mathbf{v}\cdot\mathbf{v}$. Later you will learn
why we represent possible influences on the motion of a particle by such fields 42 as $V(\mathbf{r},t)$
and $\mathbf{A}(\mathbf{r},t)$.

Returning to our fictitious world with only one spatial dimension, allowing for a potential
energy $V(x,t)$, substituting the differential operators 43 $i\hbar\,\partial/\partial t$ and $-\hbar^2\,\partial^2/\partial x^2$ for E and $p^2$ in
the resulting dispersion relation, and applying both sides of the resulting operator equation
to $\psi$, we arrive at the one-dimensional (time-dependent) Schrödinger equation:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + V\psi$$

36 http://en.wikipedia.org/wiki/Partial%20derivative

37 http://en.wikipedia.org/wiki/Classical%20physics

38 http://en.wikipedia.org/wiki/Kinetic%20energy

39 http://en.wikipedia.org/wiki/Free%20particle

40 http://en.wikipedia.org/wiki/Dispersion%20relation

41 http://en.wikipedia.org/wiki/Dot%20product

42 http://en.wikipedia.org/wiki/Field%20%28physics%29

43 http://en.wikipedia.org/wiki/Differential%20operator

In three spatial dimensions and with both potential energy $V(\mathbf{r},t)$ and potential momentum
$\mathbf{A}(\mathbf{r},t)$ present, we proceed from the relation $E - V = (\mathbf{p} - \mathbf{A})^2/2m$, substituting $i\hbar\,\partial/\partial t$ for
E and $-i\hbar\,\partial/\partial\mathbf{r}$ for $\mathbf{p}$. The differential operator $\partial/\partial\mathbf{r}$ is a vector whose components are the
differential operators $(\partial/\partial x, \partial/\partial y, \partial/\partial z)$. The result:

$$i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\left(-i\hbar\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)^2\psi + V\psi,$$

where $\psi$ is now a function of $\mathbf{r} = (x,y,z)$ and t. This is the three-dimensional Schrödinger
equation. In non-relativistic investigations (to which the Schrödinger equation is confined)
the potential momentum can generally be ignored, which is why the Schrödinger equation 44
is often given this form:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) + V\psi$$

The free Schrödinger equation (without even the potential energy term) is satisfied by
$\psi(x,t) = e^{i(kx-\omega t)}$ (in one dimension) or $\psi(\mathbf{r},t) = e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$ (in three dimensions) provided
that $E = \hbar\omega$ equals $p^2/2m = (\hbar k)^2/2m$, which is to say: $\omega(k) = \hbar k^2/2m$. However, since we
are dealing with a homogeneous linear differential equation 45 (which tells us that solutions
may be added and/or multiplied by an arbitrary constant to yield additional solutions),
any function of the form

$$\psi(x,t) = \frac{1}{\sqrt{2\pi}}\int\overline{\psi}(k)\,e^{i[kx-\omega(k)t]}\,dk = \frac{1}{\sqrt{2\pi}}\int\overline{\psi}(k,t)\,e^{ikx}\,dk$$

with $\overline{\psi}(k,t) = \overline{\psi}(k)\,e^{-i\omega(k)t}$ solves the (one-dimensional) Schrödinger equation. If no
integration boundaries are specified, then we integrate over the real line 46 , i.e., the integral is
defined as the limit $\lim_{L\to\infty}\int_{-L}^{+L}$. The converse also holds: every solution is of this form.
The factor in front of the integral is present for purely cosmetic reasons, as you will realize
presently. $\overline{\psi}(k,t)$ is the Fourier transform 47 of $\psi(x,t)$, which means that

$$\overline{\psi}(k,t) = \frac{1}{\sqrt{2\pi}}\int\psi(x,t)\,e^{-ikx}\,dx.$$

The Fourier transform of $\psi(x,t)$ exists because the integral $\int|\psi(x,t)|\,dx$ is finite. In the next
section 48 we will come to know the physical reason why this integral is finite.
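A superposition of this kind can be built numerically. The sketch below is an illustration under stated assumptions: units with $\hbar = m = 1$, a Gaussian $\overline{\psi}(k)$ chosen purely as an example, and the integral over k replaced by a Riemann sum on a finite grid. All names and parameter values are hypothetical.

```python
import cmath
import math

K0, SIGMA = 2.0, 1.0   # centre and width parameter of the packet (illustrative)
DK = 0.05
K_GRID = [K0 - 6.0 + DK * i for i in range(241)]   # k from K0-6 to K0+6

def psi_bar(k):
    """Normalized Gaussian momentum-space amplitude (assumed example)."""
    return (2.0 * SIGMA**2 / math.pi) ** 0.25 * math.exp(-(SIGMA * (k - K0)) ** 2)

def omega(k):
    """Free-particle dispersion relation omega = k^2 / 2 (hbar = m = 1)."""
    return 0.5 * k * k

# Precomputed weights psi_bar(k) dk / sqrt(2 pi) for the Riemann sum.
WEIGHTS = [psi_bar(k) * DK / math.sqrt(2.0 * math.pi) for k in K_GRID]

def psi(x, t):
    """psi(x, t) as a discrete sum over plane-wave solutions."""
    return sum(w * cmath.exp(1j * (k * x - omega(k) * t))
               for w, k in zip(WEIGHTS, K_GRID))

def norm_and_mean(t, x_min=-12.0, x_max=16.0, n=280):
    """Integral of |psi|^2 and the mean position at time t (midpoint rule)."""
    dx = (x_max - x_min) / n
    norm = 0.0
    mean = 0.0
    for i in range(n):
        x = x_min + (i + 0.5) * dx
        density = abs(psi(x, t)) ** 2
        norm += density * dx
        mean += x * density * dx
    return norm, mean / norm

norm0, mean0 = norm_and_mean(0.0)
norm1, mean1 = norm_and_mean(1.0)
```

The integral of $|\psi|^2$ stays at 1 while the packet's centre moves from 0 to about $K_0 t$, which is one way to see that the superposition remains a solution as t changes.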

So now we have a condition that every electron "wave function" must satisfy in order to
satisfy the appropriate dispersion relation. If this (and hence the Schrödinger equation)
contains either or both of the potentials 49 V and A, then finding solutions can be tough.
As a budding quantum mechanician, you will spend a considerable amount of time learning
to solve the Schrödinger equation with various potentials.


44 http://en.wikipedia.org/wiki/Schr%F6dinger%20equation

45 http://en.wikipedia.org/wiki/Differential%20equation

46 http://en.wikipedia.org/wiki/Real%20line

47 http://en.wikipedia.org/wiki/Fourier%20transform

48 Chapter 2.5

49 http://en.wikipedia.org/wiki/Potential


2.6 Born

In the same year that Erwin Schrödinger published the equation that now bears his name,
the nonrelativistic theory was completed by Max Born's 51 insight that the Schrödinger wave
function 52 $\psi(\mathbf{r},t)$ is actually nothing but a tool for calculating probabilities, and that the
probability of detecting a particle "described by" $\psi(\mathbf{r},t)$ in a region of space R is given by
the volume integral 53

$$\int_R |\psi(t,\mathbf{r})|^2\,d^3r = \int_R \psi^*\psi\,d^3r,$$

provided that the appropriate measurement is made, in this case a test for the particle's
presence in R. Since the probability of finding the particle somewhere (no matter where)
has to be 1, only a square integrable 54 function can "describe" a particle. This rules out
$\psi(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}$, which is not square integrable. In other words, no particle can have a momentum
so sharp as to be given by $\hbar$ times a wave vector 55 $\mathbf{k}$, rather than by a genuine probability
distribution over different momenta.

Given a probability density function $|\psi(x)|^2$, we can define the expected value 56

$$\langle x\rangle = \int|\psi(x)|^2\,x\,dx = \int\psi^*\,x\,\psi\,dx$$

and the standard deviation 57

$$\Delta x = \sqrt{\int|\psi|^2\,(x-\langle x\rangle)^2\,dx}$$

as well as higher moments 58 of $|\psi(x)|^2$. By the same token,

$$\langle k\rangle = \int\overline{\psi}^*\,k\,\overline{\psi}\,dk \quad\text{and}\quad \Delta k = \sqrt{\int|\overline{\psi}|^2\,(k-\langle k\rangle)^2\,dk}.$$

Here is another expression for $\langle k\rangle$:

$$\langle k\rangle = \int\psi^*(x)\left(-i\frac{d}{dx}\right)\psi(x)\,dx.$$

To check that the two expressions are in fact equal, we plug $\psi(x) = (2\pi)^{-1/2}\int\overline{\psi}(k)\,e^{ikx}\,dk$
into the latter expression:

$$\langle k\rangle = \frac{1}{\sqrt{2\pi}}\int\psi^*(x)\left(-i\frac{d}{dx}\right)\int\overline{\psi}(k)\,e^{ikx}\,dk\,dx = \frac{1}{\sqrt{2\pi}}\int\psi^*(x)\int\overline{\psi}(k)\,k\,e^{ikx}\,dk\,dx.$$

51 http://en.wikipedia.org/wiki/Max%20Born

52 http://en.wikipedia.org/wiki/Wavefunction

53 http://en.wikipedia.org/wiki/Volume%20integral

54 http://en.wikipedia.org/wiki/Integrable%20function

55 http://en.wikipedia.org/wiki/Wave%20vector

56 http://en.wikipedia.org/wiki/Expected%20value

57 http://en.wikipedia.org/wiki/Standard%20deviation

58 http://en.wikipedia.org/wiki/Moment%20%28mathematics%29

Next we replace $\psi^*(x)$ by $(2\pi)^{-1/2}\int\overline{\psi}^*(k')\,e^{-ik'x}\,dk'$ and shuffle the integrals with the
mathematical nonchalance that is common in physics:

$$\langle k\rangle = \int\!\!\int\overline{\psi}^*(k')\,k\,\overline{\psi}(k)\left[\frac{1}{2\pi}\int e^{i(k-k')x}\,dx\right]dk\,dk'.$$

The expression in square brackets is a representation of Dirac's delta distribution 59 $\delta(k-k')$,
the defining characteristic of which is $\int_{-\infty}^{+\infty} f(x)\,\delta(x)\,dx = f(0)$ for any continuous function
f(x). (In case you didn't notice, this proves what was to be proved.)
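The equality of the two expressions for $\langle k\rangle$ can also be checked numerically. The sketch below assumes a Gaussian wave function with a phase factor $e^{ik_0x}$ and its analytically known Fourier transform (both illustrative choices), and approximates the derivative by a central difference. The names and parameter values are hypothetical.

```python
import cmath
import math

S, K0 = 1.0, 1.5   # position-space width and mean wavenumber (illustrative)

def psi(x):
    """Normalized Gaussian wave function with mean wavenumber K0 (assumed example)."""
    return ((2.0 * math.pi * S**2) ** -0.25
            * math.exp(-x * x / (4.0 * S**2)) * cmath.exp(1j * K0 * x))

def psi_bar(k):
    """Fourier transform of psi in the convention used in the text."""
    return (2.0 * S**2 / math.pi) ** 0.25 * math.exp(-(S * (k - K0)) ** 2)

def mean_k_momentum(n=4000, half_width=8.0):
    """<k> as the integral of k |psi_bar(k)|^2 over k (midpoint rule)."""
    dk = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        k = K0 - half_width + (i + 0.5) * dk
        total += k * abs(psi_bar(k)) ** 2 * dk
    return total

def mean_k_position(n=4000, half_width=12.0, h=1e-5):
    """<k> as the integral of psi* (-i d/dx) psi over x (midpoint rule)."""
    dx = 2.0 * half_width / n
    total = 0.0 + 0.0j
    for i in range(n):
        x = -half_width + (i + 0.5) * dx
        dpsi = (psi(x + h) - psi(x - h)) / (2.0 * h)
        total += psi(x).conjugate() * (-1j) * dpsi * dx
    return total

k_from_momentum = mean_k_momentum()
k_from_position = mean_k_position()
```

Both routes return a value close to $K_0$, and the imaginary part of the position-space integral vanishes up to numerical error.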


2.7 Heisenberg 

In 1927, the year after quantum mechanics' annus mirabilis of 1926, Werner Heisenberg 60 proved the
so-called "uncertainty" relation 61 


$$\Delta x\,\Delta p \geq \hbar/2.$$

Heisenberg spoke of Unschärfe, the literal translation of which is "fuzziness" rather than
"uncertainty". Since the relation $\Delta x\,\Delta k \geq 1/2$ is a consequence of the fact that $\psi(x)$ and
$\overline{\psi}(k)$ are related to each other via a Fourier transformation 62 , we leave the proof to the
mathematicians. The fuzziness relation for position and momentum follows via $p = \hbar k$.
It says that the fuzziness of a position (as measured by $\Delta x$) and the fuzziness of the
corresponding momentum (as measured by $\Delta p = \hbar\,\Delta k$) must be such that their product
equals at least $\hbar/2$.
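The relation $\Delta x\,\Delta k \geq 1/2$ can be explored numerically. The sketch below is restricted, as an explicit assumption, to real wave functions, for which $\langle k\rangle = 0$ and $\langle k^2\rangle = \int|d\psi/dx|^2\,dx$ (obtained from $\int\psi^*(-d^2/dx^2)\psi\,dx$ by integration by parts). The two trial functions and all names are illustrative.

```python
import math

def fuzziness_product(psi, x_min, x_max, n=20000):
    """Estimate Delta x * Delta k for a real wave function psi.

    Assumes psi is real and negligible outside [x_min, x_max], so that <k> = 0
    and <k^2> is the integral of (d psi / dx)^2 divided by the norm.
    """
    dx = (x_max - x_min) / n
    xs = [x_min + (i + 0.5) * dx for i in range(n)]
    vals = [psi(x) for x in xs]
    norm = sum(v * v for v in vals) * dx
    mean_x = sum(x * v * v for x, v in zip(xs, vals)) * dx / norm
    var_x = sum((x - mean_x) ** 2 * v * v for x, v in zip(xs, vals)) * dx / norm
    # Central differences on the grid approximate d psi / dx.
    k_squared = sum(((vals[i + 1] - vals[i - 1]) / (2.0 * dx)) ** 2
                    for i in range(1, n - 1)) * dx / norm
    return math.sqrt(var_x * k_squared)

def gaussian(x):
    """Gaussian trial function, for which the product is the minimum 1/2."""
    return math.exp(-x * x / 4.0)

def two_sided_exponential(x):
    """Trial function exp(-|x|), for which the product is 1/sqrt(2)."""
    return math.exp(-abs(x))

product_gaussian = fuzziness_product(gaussian, -12.0, 12.0)
product_exponential = fuzziness_product(two_sided_exponential, -20.0, 20.0)
```

The Gaussian comes out at the lower bound, up to discretization error, while the two-sided exponential lies clearly above it.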

59 http://en.wikipedia.org/wiki/Dirac%20delta%20function

60 http://en.wikipedia.org/wiki/Werner%20Heisenberg

61 http://en.wikipedia.org/wiki/Uncertainty%20principle

62 http://en.wikipedia.org/wiki/Fourier%20transform


3 The Feynman route to Schrödinger


The probabilities of the possible outcomes of measurements performed at a time $t_2$ are
determined by the Schrödinger wave function $\psi(\mathbf{r},t_2)$. The wave function $\psi(\mathbf{r},t_2)$ is determined
via the Schrödinger equation 1 by $\psi(\mathbf{r},t_1)$. What determines $\psi(\mathbf{r},t_1)$? Why, the outcome of
a measurement performed at $t_1$ - what else? Actual measurement outcomes determine the
probabilities of possible measurement outcomes.


3.1 Two rules 

In this chapter we develop the quantum-mechanical probability algorithm from two
fundamental rules. To begin with, two definitions:

• Alternatives are possible sequences of measurement outcomes. 

• With each alternative is associated a complex number 2 called amplitude. 

Suppose that you want to calculate the probability of a possible outcome of a measurement 
given the actual outcome of an earlier measurement. Here is what you have to do: 

• Choose any sequence of measurements that may be made in the meantime. 

• Assign an amplitude to each alternative. 

• Apply either of the following rules: 


Rule A: If the intermediate measurements are made (or if it is possible to infer from other 
measurements what their outcomes would have been if they had been made), first square 
the absolute values of the amplitudes of the alternatives and then add the results. 

Rule B: If the intermediate measurements are not made (and if it is not possible to infer 
from other measurements what their outcomes would have been), first add the amplitudes 
of the alternatives and then square the absolute value of the result. 
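The difference between the two rules is easy to see in code. The following minimal sketch uses hypothetical function names and made-up amplitudes for two alternatives; it is meant only to show the order of the operations "add" and "square".

```python
def rule_a(amplitudes):
    """Rule A: square the absolute values of the amplitudes, then add."""
    return sum(abs(a) ** 2 for a in amplitudes)

def rule_b(amplitudes):
    """Rule B: add the amplitudes, then square the absolute value of the sum."""
    return abs(sum(amplitudes)) ** 2

# Two alternatives with equal magnitudes (illustrative amplitudes).
in_phase = [0.5, 0.5]
out_of_phase = [0.5, -0.5]

# Rule A cannot tell the two cases apart; Rule B can.
a_in, a_out = rule_a(in_phase), rule_a(out_of_phase)
b_in, b_out = rule_b(in_phase), rule_b(out_of_phase)
```

Under Rule A both cases give the same result, whereas under Rule B the relative phase of the amplitudes decides between reinforcement and cancellation.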

In subsequent sections we will explore the consequences of these rules for a variety of setups,
and we will think about their origin, their raison d'être. Here we shall use Rule B to
determine the interpretation of $\overline{\psi}(k)$ given Born's probabilistic interpretation of $\psi(x)$.

In the so-called "continuum normalization", the unphysical limit of a particle with a sharp
momentum $\hbar k'$ is associated with the wave function

$$\psi_{k'}(x,t) = \frac{1}{\sqrt{2\pi}}\int\delta(k-k')\,e^{i[kx-\omega(k)t]}\,dk = \frac{1}{\sqrt{2\pi}}\,e^{i[k'x-\omega(k')t]}.$$

1 Chapter 2.4

2 http://en.wikipedia.org/wiki/Complex%20number

Hence we may write $\psi(x,t) = \int\overline{\psi}(k)\,\psi_k(x,t)\,dk$.

$\overline{\psi}(k)$ is the amplitude for the outcome $\hbar k$ of an infinitely precise momentum measurement.
$\psi_k(x,t)$ is the amplitude for the outcome x of an infinitely precise position measurement
performed (at time t) subsequent to an infinitely precise momentum measurement with
outcome $\hbar k$. And $\psi(x,t)$ is the amplitude for obtaining x by an infinitely precise position
measurement performed at time t.

The preceding equation therefore tells us that the amplitude for finding x at t is the product
of

1. the amplitude for the outcome $\hbar k$ and

2. the amplitude for the outcome x (at time t) subsequent to a momentum measurement
with outcome $\hbar k$,

summed over all values of k.

Under the conditions stipulated by Rule A, we would have instead that the probability for
finding x at t is the product of

1. the probability for the outcome $\hbar k$ and

2. the probability for the outcome x (at time t) subsequent to a momentum measurement
with outcome $\hbar k$,

summed over all values of k.

The latter is what we expect on the basis of standard probability theory. But if this holds
under the conditions stipulated by Rule A, then the same holds with "amplitude" substituted
for "probability" under the conditions stipulated by Rule B. Hence, given that $\psi_k(x,t)$
and $\psi(x,t)$ are amplitudes for obtaining the outcome x in an infinitely precise position
measurement, $\overline{\psi}(k)$ is the amplitude for obtaining the outcome $\hbar k$ in an infinitely precise
momentum measurement.

Notes: 

1. Since Rule B stipulates that the momentum measurement is not actually made, we 
need not worry about the impossibility of making an infinitely precise momentum 
measurement. 

2. If we refer to $|\psi(x)|^2$ as "the probability of obtaining the outcome x," what we mean
is that $|\psi(x)|^2$ integrated over any interval or subset of the real line 3 is the probability
of finding our particle in this interval or subset.


3 http://en.wikipedia.org/wiki/Real%20line


3.2 An experiment with two slits 





Figure 27 The setup 


In this experiment, the final measurement (to the possible outcomes of which probabilities 
are assigned) is the detection of an electron at the backdrop, by a detector situated at D 
(D being a particular value of x). The initial measurement outcome, on the basis of which 
probabilities are assigned, is the launch of an electron by an electron gun G. (Since we 
assume that G is the only source of free electrons, the detection of an electron behind the 
slit plate also indicates the launch of an electron in front of the slit plate.) The alternatives 
or possible intermediate outcomes are 

• the electron went through the left slit (L), 

• the electron went through the right slit (R). 

The corresponding amplitudes are $A_L$ and $A_R$.

Here is what we need to know in order to calculate them:

• $A_L$ is the product of two complex numbers, for which we shall use the symbols $\langle D|L\rangle$
and $\langle L|G\rangle$.

• By the same token, $A_R = \langle D|R\rangle\,\langle R|G\rangle$.

• The absolute value of $\langle B|A\rangle$ is inversely proportional to the distance d(BA) between A
and B.

• The phase of $\langle B|A\rangle$ is proportional to d(BA).

For obvious reasons $\langle B|A\rangle$ is known as a propagator.


3.2.1 Why product? 

Recall the fuzziness (''uncertainty 1 ') relation 4 , which implies that A p — > oo as Ax — >• 0. In 
this limit the particle's momentum is completely indefinite or, what comes to the same, has 
no value at all. As a consequence, the probability of finding a particle at B, given that it 
was last "seen" at A, depends on the initial position A but not on any initial momentum, 
inasmuch as there is none. Hence whatever the particle does after its detection at A is 
independent of what it did before then. In probability-theoretic terms this means that the 
particle's propagation from G to L and its propagation from L to D are independent events. 
So the probability of propagation from G to D via L is the product of the corresponding 
probabilities, and so the amplitude of propagation from G to D via L is the product 
$\langle D|L\rangle\,\langle L|G\rangle$ of the corresponding amplitudes. 

3.2.2 Why is the absolute value inversely proportional to the distance? 

Imagine (i) a sphere of radius r whose center is A and (ii) a detector monitoring a unit 
area of the surface of this sphere. Since the total surface area is proportional to $r^2$, and 
since for a free particle the probability of detection per unit area is constant over the entire 
surface (explain why!), the probability of detection per unit area is inversely proportional 
to $r^2$. The absolute value of the amplitude of detection per unit area, being the square root 
of the probability, is therefore inversely proportional to r. 

3.2.3 Why is the phase proportional to the distance? 

The multiplicativity of successive propagators implies the additivity of their phases. Together 
with the fact that, in the case of a free particle, the propagator $\langle B|A\rangle$ (and hence its phase) 
can only depend on the distance between A and B, it implies the proportionality of the 
phase of $\langle B|A\rangle$ to $d(BA)$. 

3.2.4 Calculating the interference pattern 

According to Rule A, the probability of detecting at D an electron launched at G is 

$$p_A(D) = |\langle D|L\rangle\langle L|G\rangle|^2 + |\langle D|R\rangle\langle R|G\rangle|^2.$$

If the slits are equidistant from G, then $\langle L|G\rangle$ and $\langle R|G\rangle$ are equal and $p_A(D)$ is proportional 
to 

$$|\langle D|L\rangle|^2 + |\langle D|R\rangle|^2 = 1/d^2(DL) + 1/d^2(DR).$$

4 Chapter 2.5 on page 26 

Here is the resulting plot of $p_A$ against the position x of the detector: 

Figure 28 Predicted relative frequency of detection according to Rule A 

$p_A(x)$ (solid line) is the sum of two distributions (dotted lines), one for the electrons that 
went through L and one for the electrons that went through R. 

According to Rule B, the probability $p_B(D)$ of detecting at D an electron launched at G is 
proportional to 

$$|\langle D|L\rangle + \langle D|R\rangle|^2 = 1/d^2(DL) + 1/d^2(DR) + 2\cos(k\Delta)/[d(DL)\,d(DR)],$$

where $\Delta$ is the difference $d(DR) - d(DL)$ and $k = p/\hbar$ is the wavenumber, which is sufficiently 
sharp to be approximated by a number. (And it goes without saying that you should check 
this result.) 
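
The check invited above can also be done numerically. The following Python sketch assumes, as stated earlier, that a propagator has magnitude 1/d and phase kd, with both constants of proportionality set to 1. The function names and the sample distances and wavenumber are illustrative choices, not values from the text.

```python
import cmath
import math

def propagator(d, k):
    """Free propagator for separation d: magnitude 1/d, phase k*d
    (both proportionality constants assumed to be 1)."""
    return cmath.exp(1j * k * d) / d

def rule_b_direct(d_dl, d_dr, k):
    """|<D|L> + <D|R>|^2, computed directly from the two amplitudes."""
    return abs(propagator(d_dl, k) + propagator(d_dr, k)) ** 2

def rule_b_formula(d_dl, d_dr, k):
    """Closed form: 1/d^2(DL) + 1/d^2(DR) + 2 cos(k*Delta) / [d(DL) d(DR)]."""
    delta = d_dr - d_dl
    return 1 / d_dl**2 + 1 / d_dr**2 + 2 * math.cos(k * delta) / (d_dl * d_dr)

# Illustrative numbers in arbitrary units (assumptions, not from the text)
d_dl, d_dr, k = 3.0, 3.7, 12.5
direct = rule_b_direct(d_dl, d_dr, k)
closed = rule_b_formula(d_dl, d_dr, k)
print(direct, closed)
```

The two printed numbers should agree to within rounding error, because the cross term of the absolute square is twice the real part of a phase factor whose phase is k times the path difference.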

Here is the plot of $p_B$ against x for a particular set of values for the wavenumber, the 
distance between the slits, and the distance between the slit plate and the backdrop: 

Figure 29 Predicted relative frequency of detection according to Rule B 


Observe that near the minima the probability of detection is less if both slits are open than 
it is if one slit is shut. It is customary to say that destructive interference occurs at the 
minima and that constructive interference occurs at the maxima, but do not think of this as 
the description of a physical process. All we mean by "constructive interference" is that a 
probability calculated according to Rule B is greater than the same probability calculated 
according to Rule A, and all we mean by "destructive interference" is that a probability 
calculated according to Rule B is less than the same probability calculated according to 
Rule A. 
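
To see these two definitions at work, here is a small Python sketch that evaluates both rules along the backdrop and sorts detector positions by whether Rule B gives more or less than Rule A. The geometry (slits at x = ±slit_sep/2, backdrop at distance screen_dist), the wavenumber, and all function names are hypothetical choices made for illustration only.

```python
import cmath
import math

def slit_distances(x, slit_sep, screen_dist):
    """Distances d(DL), d(DR) from slits at (-slit_sep/2, 0) and (+slit_sep/2, 0)
    to a detector at (x, screen_dist)."""
    return (math.hypot(x + slit_sep / 2, screen_dist),
            math.hypot(x - slit_sep / 2, screen_dist))

def p_rule_a(x, slit_sep, screen_dist):
    """Rule A: add the two probabilities (each proportional to 1/d^2)."""
    d_l, d_r = slit_distances(x, slit_sep, screen_dist)
    return 1 / d_l**2 + 1 / d_r**2

def p_rule_b(x, k, slit_sep, screen_dist):
    """Rule B: add the two amplitudes first, then take the absolute square."""
    d_l, d_r = slit_distances(x, slit_sep, screen_dist)
    amplitude = cmath.exp(1j * k * d_l) / d_l + cmath.exp(1j * k * d_r) / d_r
    return abs(amplitude) ** 2

# Hypothetical setup in arbitrary units
K, SLIT_SEP, SCREEN_DIST = 20.0, 1.0, 10.0
positions = [i * 0.05 for i in range(-100, 101)]
ratios = [p_rule_b(x, K, SLIT_SEP, SCREEN_DIST) / p_rule_a(x, SLIT_SEP, SCREEN_DIST)
          for x in positions]
constructive = [x for x, r in zip(positions, ratios) if r > 1]
destructive = [x for x, r in zip(positions, ratios) if r < 1]
print(len(constructive), "positions with Rule B > Rule A")
print(len(destructive), "positions with Rule B < Rule A")
```

At the center of the backdrop the two distances are equal, so the Rule B value is twice the Rule A value. Nothing in the code describes a physical process: it only compares two ways of calculating a probability.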

Here is how an interference pattern builds up over time 5 : 



Figure 30 100 electrons 


5 A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki, & H. Ezawa, "Demonstration of single-electron 
buildup of an interference pattern", American Journal of Physics 57 , 117-120, 1989. 
Figure 31 3000 electrons 



Figure 32 20000 electrons 



Figure 33 70000 electrons 




3.3 Bohm's story 

3.3.1 Hidden Variables 

Suppose that the conditions stipulated by Rule B 7 are met: there is nothing (no event, 
no state of affairs, anywhere, anytime) from which the slit taken by an electron can be 
inferred. Can it be true, in this case, 


7 Chapter 2.7 on page 27 


• that each electron goes through a single slit — either L or R — and 

• that the behavior of an electron that goes through one slit does not depend on whether 
the other slit is open or shut? 

To keep the language simple, we will say that an electron leaves a mark where it is detected 
at the backdrop. If each electron goes through a single slit, then the observed distribution 
of marks when both slits are open is the sum of two distributions, one from electrons that 
went through L and one from electrons that went through R: 


$$p_B(x) = p_L(x) + p_R(x).$$

If in addition the behavior of an electron that goes through one slit does not depend on 
whether the other slit is open or shut, then we can observe $p_L(x)$ by keeping R shut, and 
we can observe $p_R(x)$ by keeping L shut. What we observe if R is shut is the left dashed 
hump, and what we observe if L is shut is the right dashed hump: 



Hence if the above two conditions (as well as those stipulated by Rule B) are satisfied, we 
will see the sum of these two humps. In reality what we see is this: 
Thus those conditions cannot all be satisfied simultaneously. If Rule B applies, then 
either it is false that each electron goes through a single slit or the behavior of an electron 
that goes through one slit does depend on whether the other slit is open or shut. 

Which is it? 

According to one attempt to make physical sense of the mathematical formalism of quantum 
mechanics, due to Louis de Broglie 8 and David Bohm 9 , each electron goes through a single 
slit, and the behavior of an electron that goes through one slit depends on whether the other 
slit is open or shut. 

So how does the state of, say, the right slit (open or shut) affect the behavior of an electron 
that goes through the left slit? In both de Broglie's pilot wave theory and Bohmian 
mechanics 10 , the electron is assumed to be a well-behaved particle in the sense that it follows 
a precise path — its position at any moment is given by three coordinates — and in addition 
there exists a wave that guides the electron by exerting on it a force. If only one slit is 
open, this wave passes through one slit. If both slits are open, it passes through both slits 
and interferes with itself (in the "classical" sense of interference). As a result, it guides the 
electrons along wiggly paths that cluster at the backdrop so as to produce the observed 
interference pattern: 


8 http://en.wikipedia.org/wiki/Louis%2C%207th%20duc%20de%20Broglie 

9 http://en.wikipedia.org/wiki/David%20Bohm 

10 http://en.wikipedia.org/wiki/Bohm%20interpretation 
Figure 36 


According to this story, the reason why electrons coming from the same source or slit arrive 
in different places, is that they start out in slightly different directions and/or with slightly 
different speeds. If we had exact knowledge of their initial positions and momenta, we could 
make an exact prediction of each electron's subsequent motion. This, however, is impossible. 
The uncertainty principle 11 prevents us from making exact predictions of a particle's motion. 
Hence even though according to Bohm the initial positions and momenta have precise 
values, we can never know them. 

If positions and momenta have precise values, then why can we not measure them? It used 
to be said that this is because a measurement exerts an uncontrollable influence on the 
value of the observable being measured. Yet this merely raises another question: why do 
measurements exert uncontrollable influences? This may be true for all practical purposes, 
but the uncertainty principle does not say that $\Delta x\,\Delta p \geq \hbar/2$ merely holds for all practical 
purposes. Moreover, it isn't the case that measurements necessarily "disturb" the systems 
on which they are performed. 

The statistical element of quantum mechanics is an essential feature of the theory. The 
postulate of an underlying determinism, which in order to be consistent with the theory has 
to be a crypto-determinism 12, not only adds nothing to our understanding of the theory but 


11 Chapter 2.7 on page 27 

12 http://en.wikipedia.org/wiki/Crypto 
also precludes any proper understanding of this essential feature of the theory. There is, in 
fact, a simple and obvious reason why hidden variables 13 are hidden: the reason why they 
are strictly (rather than merely for all practical purposes) unobservable is that they do not 
exist. 

At one time Einstein insisted that theories ought to be formulated without reference to 
unobservable quantities. When Heisenberg later mentioned to Einstein that this maxim had 
guided him in his discovery of the uncertainty principle, Einstein replied something to this 
effect: "Even if I once said so, it is nonsense." His point was that before one has a theory, 
one cannot know what is observable and what is not. Our situation here is different. We 
have a theory, and this tells us in no uncertain terms what is observable and what is not. 


3.4 Propagator for a free and stable particle 

3.4.1 The propagator as a path integral 

Suppose that we make m intermediate position measurements at fixed intervals of 
duration $\Delta t$. Each of these measurements is made with the help of an array of detectors 
monitoring n mutually disjoint regions $R_k$, $k = 1, \dots, n$. Under the conditions stipulated 
by Rule B, the propagator 14 $\langle B|A\rangle$ now equals the sum of amplitudes 

$$\sum_{k_1=1}^{n} \cdots \sum_{k_m=1}^{n} \langle B|R_{k_m}\rangle \cdots \langle R_{k_2}|R_{k_1}\rangle \langle R_{k_1}|A\rangle.$$

It is not hard to see what happens in the double limit $\Delta t \to 0$ (which implies that $m \to \infty$) 
and $n \to \infty$. The multiple sum $\sum_{k_1=1}^{n} \cdots \sum_{k_m=1}^{n}$ becomes an integral $\int\!\mathcal{D}C$ over continuous 
spacetime paths from A to B, and the amplitude $\langle B|R_{k_m}\rangle \cdots \langle R_{k_1}|A\rangle$ becomes a complex-valued 
functional $Z[C: A \to B]$, that is, a complex function of continuous functions representing 
continuous spacetime paths from A to B: 

$$\langle B|A\rangle = \int\!\mathcal{D}C\, Z[C: A \to B].$$


The integral $\int\!\mathcal{D}C$ is not your standard Riemann integral 15 $\int dx\, f(x)$, to which each infinitesimal 
interval dx makes a contribution proportional to the value that f(x) takes inside the 
interval, but a functional or path integral 16, to which each "bundle" of paths of infinitesimal 
width $\mathcal{D}C$ makes a contribution proportional to the value that Z[C] takes inside the bundle. 

As it stands, the path integral $\int\!\mathcal{D}C$ is just the idea of an idea. Appropriate evaluation 
methods have to be devised on a more or less case-by-case basis. 


13 http://en.wikipedia.org/wiki/Hidden%20variable%20theory 

14 Chapter 3.1 on page 30 

15 http://en.wikipedia.org/wiki/Riemann%20integral 

16 http://en.wikipedia.org/wiki/Functional%20integration 
3.4.2 A free particle 

Now pick any path C from A to B, and then pick any infinitesimal segment dC of C. Label the 
start and end points of dC by inertial coordinates 17 t,x,y,z and t+dt, x+dx, y+dy, z+dz, 
respectively. In the general case, the amplitude Z(dC) will be a function of t,x,y,z and 
dt,dx,dy,dz. In the case of a free particle, Z(dC) depends neither on the position of dC 
in spacetime (given by t,x,y,z) nor on the spacetime orientation of dC (given by the 
four-velocity 18 (c dt/ds, dx/ds, dy/ds, dz/ds)) but only on the proper time 19 interval 
$ds = \sqrt{dt^2 - (dx^2 + dy^2 + dz^2)/c^2}$. 

(Because its norm equals the speed of light, the four-velocity depends on three rather than 
four independent parameters. Together with ds, they contain the same information as the 
four independent numbers dt,dx,dy,dz.) 

Thus for a free particle Z(dC) = Z(ds). With this, the multiplicativity of successive 
propagators 20 tells us that 

$$\prod_j Z(ds_j) = Z\Big(\sum_j ds_j\Big) \longrightarrow Z\Big(\int_C ds\Big).$$

It follows that there is a complex number z such that $Z[C] = e^{z\,s[C: A \to B]}$, where the line 
integral 21 $s[C: A \to B] = \int_C ds$ gives the time that passes on a clock as it travels from A to 
B via C. 
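
The step from multiplicativity to the exponential can be illustrated numerically. The Python sketch below works in 1+1 dimensions with c = 1, uses a made-up value of z and a made-up piecewise-straight path, and checks that the product of the segment amplitudes equals the exponential of z times the total proper time. All names and numbers are assumptions chosen for illustration.

```python
import cmath
import math

def proper_time(dt, dx, c=1.0):
    """ds = sqrt(dt^2 - dx^2/c^2) for a timelike segment in 1+1 dimensions."""
    return math.sqrt(dt**2 - (dx / c)**2)

def path_amplitude(points, z):
    """Multiply the segment amplitudes exp(z*ds) along a piecewise-straight path
    given as [(t0, x0), (t1, x1), ...]; also return the total proper time."""
    amplitude = 1.0 + 0.0j
    total_s = 0.0
    for (t0, x0), (t1, x1) in zip(points, points[1:]):
        ds = proper_time(t1 - t0, x1 - x0)
        amplitude *= cmath.exp(z * ds)
        total_s += ds
    return amplitude, total_s

z = -2.0j                                                  # hypothetical value
path = [(0.0, 0.0), (1.0, 0.3), (2.0, 0.1), (3.0, 0.5)]    # hypothetical path
amp, total_s = path_amplitude(path, z)
print(amp, cmath.exp(z * total_s), total_s)
```

The total proper time comes out smaller than the coordinate time of 3 units, as it must for a path that is not at rest in the chosen frame.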


3.4.3 A free and stable particle 

By integrating $|\langle B|A\rangle|^2$ (as a function of $\mathbf{r}_B$) over the whole of space, we obtain the 
probability of finding that a particle launched at the spacetime point $(t_A, \mathbf{r}_A)$ still exists at 
the time $t_B$. For a stable particle this probability equals 1: 

$$\int d^3r_B\, |\langle t_B,\mathbf{r}_B|t_A,\mathbf{r}_A\rangle|^2 = \int d^3r_B \left| \int\!\mathcal{D}C\, e^{z\,s[C: A \to B]} \right|^2 = 1.$$

If you contemplate this equation with a calm heart and an open mind, you will notice that 
if the complex number $z = a + ib$ had a real part $a \neq 0$, then the integral between the two 
equal signs would either blow up (a > 0) or drop off (a < 0) exponentially as a function of 
$t_B$, due to the exponential factor $e^{a\,s[C]}$. Stability thus requires a = 0, which leaves $z = ib$. 


3.4.4 Meaning of mass 

The propagator for a free and stable particle thus has a single "degree of freedom": it depends 
solely on the value of b. If proper time is measured in seconds, then b is measured in radians 


17 http://en.wikipedia.org/wiki/Inertial%20frame%20of%20reference 

18 http://en.wikipedia.org/wiki/Four-velocity 

19 http://en.wikipedia.org/wiki/Proper%20time 

20 Chapter 3.2.1 on page 32 

21 http://en.wikipedia.org/wiki/Line%20integral 
per second. We may think of $e^{ibs}$, with s a proper-time parametrization of C, as a clock 
carried by a particle that travels from A to B via C, provided we keep in mind that we are 
thinking of an aspect of the mathematical formalism of quantum mechanics rather than an 
aspect of the real world. 

It is customary 

• to insert a minus (so the clock actually turns clockwise!): $Z = e^{-ib\,s[C]}$, 

• to multiply by $2\pi$ (so that we may think of b as the rate at which the clock "ticks", that is, the 
number of cycles it completes each second): $Z = e^{-i\,2\pi b\,s[C]}$, 

• to divide by Planck's constant h (so that b is measured in energy units and called the 
rest energy 22 of the particle): $Z = e^{-i(2\pi/h)\,b\,s[C]} = e^{-(i/\hbar)\,b\,s[C]}$, 

• and to multiply by $c^2$ (so that b is measured in mass units and called the particle's rest 
mass 23): $Z = e^{-(i/\hbar)\,b\,c^2 s[C]}$. 

The purpose of using the same letter b everywhere is to emphasize that it denotes the 
same physical quantity, merely measured in different units. If we use natural units 24 in 
which $\hbar = c = 1$, rather than conventional ones, the identity of the various b's is immediately 
obvious. 
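
To get a feeling for the numbers, the following Python sketch expresses b for an electron in each of these units. The constants are the SI values of h and c and the CODATA 2018 value of the electron mass; the function and dictionary keys are names invented for this illustration.

```python
import math

PLANCK_H = 6.62607015e-34         # J s, exact by SI definition
SPEED_OF_LIGHT = 299792458.0      # m/s, exact by SI definition
ELECTRON_MASS = 9.1093837015e-31  # kg, CODATA 2018 recommended value

def b_in_all_units(mass_kg):
    """Express the same quantity b as a mass, a rest energy, a number of
    cycles per second, and a number of radians per second."""
    rest_energy = mass_kg * SPEED_OF_LIGHT**2            # multiply by c^2
    cycles_per_second = rest_energy / PLANCK_H           # divide by h
    radians_per_second = 2 * math.pi * cycles_per_second
    return {
        "mass_kg": mass_kg,
        "rest_energy_J": rest_energy,
        "cycles_per_s": cycles_per_second,
        "radians_per_s": radians_per_second,
    }

electron = b_in_all_units(ELECTRON_MASS)
for unit, value in electron.items():
    print(f"{unit}: {value:.6e}")
```

For the electron the "clock" completes roughly 1.2 × 10^20 cycles per second of proper time, which gives an idea of how rapidly the phase of Z[C] varies from one path to another.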


3.5 From quantum to classical 

3.5.1 Action 

Let's go back to the propagator 


$$\langle B|A\rangle = \int\!\mathcal{D}C\, Z[C: A \to B].$$

For a free and stable particle we found that 

$$Z[C] = e^{-(i/\hbar)\,mc^2 s[C]}, \qquad s[C] = \int_C ds,$$

where $ds = \sqrt{dt^2 - (dx^2 + dy^2 + dz^2)/c^2}$ is the proper-time interval associated with the 
path element dC. For the general case we found that the amplitude Z(dC) is a function 
of t,x,y,z and dt,dx,dy,dz or, equivalently, of the coordinates t,x,y,z, the components 
c dt/ds, dx/ds, dy/ds, dz/ds of the 4-velocity, as well as ds. For a particle that is stable but 
not free, we obtain, by the same argument that led to the above amplitude, 

$$Z[C] = e^{(i/\hbar)S[C]},$$


22 http://en.wikipedia.org/wiki/Rest%20energy 

23 http://en.wikipedia.org/wiki/Invariant%20mass 

24 http://en.wikipedia.org/wiki/Natural%20units 
where we have introduced the functional $S[C] = \int_C dS$, which goes by the name action. 

For a free and stable particle, S[C] is the proper time (or proper duration) $s[C] = \int_C ds$ 
multiplied by $-mc^2$, and the infinitesimal action dS[dC] is proportional to ds: 

$$S[C] = -mc^2\, s[C], \qquad dS[dC] = -mc^2\, ds.$$

Let's recap. We know all about the motion of a stable particle if we know how to calculate 
the probability $p(B|A)$ (in all circumstances). We know this if we know the amplitude $\langle B|A\rangle$. 
We know the latter if we know the functional Z[C]. And we know this functional if we know 
the infinitesimal action dS(t,x,y,z,dt,dx,dy,dz) or $dS(t,\mathbf{r},dt,d\mathbf{r})$ (in all circumstances). 

What do we know about dS ? 

The multiplicativity of successive propagators implies the additivity of actions associated 
with neighboring infinitesimal path segments $dC_1$ and $dC_2$. In other words, 

$$e^{(i/\hbar)\,dS(dC_1 + dC_2)} = e^{(i/\hbar)\,dS(dC_2)}\, e^{(i/\hbar)\,dS(dC_1)}$$

implies 

$$dS(dC_1 + dC_2) = dS(dC_1) + dS(dC_2).$$

It follows that the differential dS is homogeneous (of degree 1) in the differentials $dt, d\mathbf{r}$: 

$$dS(t,\mathbf{r},\lambda\,dt,\lambda\,d\mathbf{r}) = \lambda\, dS(t,\mathbf{r},dt,d\mathbf{r}).$$

This property of dS makes it possible to think of the action S[C] as a (particle-specific) 
length associated with C, and of dS as defining a (particle-specific) spacetime geometry. By 
substituting 1/dt for $\lambda$ we get: 

$$dS(t,\mathbf{r},1,\mathbf{v}) = \frac{dS}{dt}.$$

Something is wrong, isn't it? Since the right-hand side is now a finite quantity, we shouldn't 
use the symbol dS for the left-hand side. What we have actually found is that there is a 
function $L(t,\mathbf{r},\mathbf{v})$, which goes by the name Lagrange function, such that dS = L dt. 
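
Both statements are easy to test on the one action differential we already know, that of a free particle. The Python sketch below checks homogeneity for a positive scale factor and checks that the Lagrange function, obtained by putting 1 in place of dt and v in place of dr, reproduces dS when multiplied by dt. The units (m = c = 1), the function names, and the sample differentials are assumptions made for illustration.

```python
import math

def ds_free(dt, dx, dy, dz, m=1.0, c=1.0):
    """Free-particle action differential: -m c^2 sqrt(dt^2 - (dx^2+dy^2+dz^2)/c^2)."""
    return -m * c**2 * math.sqrt(dt**2 - (dx**2 + dy**2 + dz**2) / c**2)

def lagrangian_free(vx, vy, vz, m=1.0, c=1.0):
    """L(v): the action differential with dt replaced by 1 and dr replaced by v."""
    return ds_free(1.0, vx, vy, vz, m, c)

# Hypothetical small differentials and a positive scale factor
dt, dx, dy, dz = 2e-3, 4e-4, -6e-4, 1e-4
lam = 3.5

unscaled = ds_free(dt, dx, dy, dz)
scaled = ds_free(lam * dt, lam * dx, lam * dy, lam * dz)
via_lagrangian = lagrangian_free(dx / dt, dy / dt, dz / dt) * dt

print(scaled, lam * unscaled)      # homogeneity of degree 1
print(via_lagrangian, unscaled)    # dS = L dt
```

For the free particle the Lagrange function found this way is the familiar expression −mc² √(1 − v²/c²).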

3.5.2 Geodesic equations 

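Consider a spacetime path C from A to B. Let's change ("vary") it in such a way that 
every point $(t,\mathbf{r})$ of C gets shifted by an infinitesimal amount to a corresponding point 
$(t + \delta t, \mathbf{r} + \delta\mathbf{r})$, except the end points, which are held fixed: $\delta t = 0$ and $\delta\mathbf{r} = 0$ at both A 
and B. 

If $t \to t + \delta t$, then $dt = t_2 - t_1 \to t_2 + \delta t_2 - (t_1 + \delta t_1) = (t_2 - t_1) + (\delta t_2 - \delta t_1) = dt + d\delta t$. 

By the same token, $d\mathbf{r} \to d\mathbf{r} + d\delta\mathbf{r}$. 

In general, the change $C \to C'$ will cause a corresponding change in the action: $S[C] \to S[C'] \neq S[C]$. 
If the action does not change (that is, if it is stationary at C), 

$$\delta S = \int_{C'} dS - \int_C dS = 0,$$

then C is a geodesic of the geometry defined by dS. (A function f(x) is stationary at those 
values of x at which its value does not change if x changes infinitesimally. By the same token 
we call a functional S[C] stationary if its value does not change if C changes infinitesimally.) 

To obtain a handier way to characterize geodesics, we begin by expanding 

$$dS(C') = dS(t + \delta t, \mathbf{r} + \delta\mathbf{r}, dt + d\delta t, d\mathbf{r} + d\delta\mathbf{r})$$

$$= dS(t,\mathbf{r},dt,d\mathbf{r}) + \frac{\partial dS}{\partial t}\,\delta t + \frac{\partial dS}{\partial\mathbf{r}}\cdot\delta\mathbf{r} + \frac{\partial dS}{\partial dt}\,d\delta t + \frac{\partial dS}{\partial d\mathbf{r}}\cdot d\delta\mathbf{r}.$$

This gives us 

$$(*)\quad \int_{C'} dS - \int_C dS = \int_C \left[ \frac{\partial dS}{\partial t}\,\delta t + \frac{\partial dS}{\partial\mathbf{r}}\cdot\delta\mathbf{r} + \frac{\partial dS}{\partial dt}\,d\delta t + \frac{\partial dS}{\partial d\mathbf{r}}\cdot d\delta\mathbf{r} \right].$$

Next we use the product rule for derivatives, 

$$d\left(\frac{\partial dS}{\partial dt}\,\delta t\right) = \left(d\,\frac{\partial dS}{\partial dt}\right)\delta t + \frac{\partial dS}{\partial dt}\,d\delta t,$$

$$d\left(\frac{\partial dS}{\partial d\mathbf{r}}\cdot\delta\mathbf{r}\right) = \left(d\,\frac{\partial dS}{\partial d\mathbf{r}}\right)\cdot\delta\mathbf{r} + \frac{\partial dS}{\partial d\mathbf{r}}\cdot d\delta\mathbf{r},$$

to replace the last two terms of (*), which takes us to 

$$\delta S = \int_C \left[ \left(\frac{\partial dS}{\partial t} - d\,\frac{\partial dS}{\partial dt}\right)\delta t + \left(\frac{\partial dS}{\partial\mathbf{r}} - d\,\frac{\partial dS}{\partial d\mathbf{r}}\right)\cdot\delta\mathbf{r} \right] + \int_C d\left(\frac{\partial dS}{\partial dt}\,\delta t + \frac{\partial dS}{\partial d\mathbf{r}}\cdot\delta\mathbf{r}\right).$$

The second integral vanishes because it is equal to the difference between the values of the 
expression in brackets at the end points A and B, where $\delta t = 0$ and $\delta\mathbf{r} = 0$. If C is a geodesic, 
then the first integral vanishes, too. In fact, in this case $\delta S = 0$ must hold for all possible 
(infinitesimal) variations $\delta t$ and $\delta\mathbf{r}$, whence it follows that the integrand of the first integral 
vanishes. The bottom line is that the geodesics defined by dS satisfy the geodesic equations 

$$\frac{\partial dS}{\partial t} = d\,\frac{\partial dS}{\partial dt}, \qquad \frac{\partial dS}{\partial\mathbf{r}} = d\,\frac{\partial dS}{\partial d\mathbf{r}}.$$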
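
What "stationary" means can be made tangible with the free-particle action, for which the geodesics are straight worldlines. The Python sketch below (1+1 dimensions, m = c = 1) shifts a straight path by a small multiple eps of a fixed bump that vanishes at both end points, and checks that the change in the action scales like eps squared, so that there is no first-order change. The bump shape, the discretization, and all names are illustrative assumptions.

```python
import math

def free_action(path, m=1.0, c=1.0):
    """S[C] = -m c^2 times the proper time of a piecewise-straight path
    given as [(t0, x0), (t1, x1), ...] in 1+1 dimensions."""
    s = 0.0
    for (t0, x0), (t1, x1) in zip(path, path[1:]):
        s += math.sqrt((t1 - t0)**2 - ((x1 - x0) / c)**2)
    return -m * c**2 * s

def varied_path(eps, n=1000, duration=1.0):
    """Straight path x = 0 from t = 0 to t = duration, shifted in x by
    eps*sin(pi*t/duration); the shift vanishes at both end points."""
    return [(duration * i / n, eps * math.sin(math.pi * i / n))
            for i in range(n + 1)]

s0 = free_action(varied_path(0.0))
d1 = free_action(varied_path(1e-3)) - s0   # change in S for a shift of size eps
d2 = free_action(varied_path(2e-3)) - s0   # change in S for a shift of size 2*eps
print(s0, d1, d2, d2 / d1)
```

Doubling the size of the shift quadruples the change in the action, which is the numerical signature of a vanishing first-order variation.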
3.5.3 Principle of least action 

If an object travels from A to B, it travels along all paths from A to B, in the same sense in 
which an electron goes through both slits. Then how is it that a big thing (such as a planet, 
a tennis ball, or a mosquito) appears to move along a single well-defined path? 

There are at least two reasons. One of them is that the bigger an object is, the harder it is to 
satisfy the conditions stipulated by Rule B. Another reason is that even if these conditions 
are satisfied, the likelihood of finding an object of mass m where according to the laws of 
classical physics it should not be, decreases as m increases. 

To see this, we need to take account of the fact that it is strictly impossible to check whether 
an object that has travelled from A to B, has done so along a mathematically precise path C. 
Let us make the half-realistic assumption that what we can check is whether an object has 
travelled from A to B within a narrow bundle of paths, namely the paths contained in a narrow 
tube $\mathcal{T}$. The probability of finding that it has, is the absolute square of the path integral 
$I(\mathcal{T}) = \int_{\mathcal{T}} \mathcal{D}C\, e^{(i/\hbar)S[C]}$, which sums over the paths contained in $\mathcal{T}$. 

Let us assume that there is exactly one path from A to B for which S[C] is stationary: its 
length does not change if we vary the path ever so slightly, no matter how. In other words, 
we assume that there is exactly one geodesic. Let's call it $\mathcal{G}$, and let's assume it lies in $\mathcal{T}$. 

No matter how rapidly the phase $S[C]/\hbar$ changes under variation of a generic path C, it 
will be stationary at $\mathcal{G}$. This means, loosely speaking, that a large number of paths near $\mathcal{G}$ 
contribute to $I(\mathcal{T})$ with almost equal phases. As a consequence, the magnitude of the sum 
of the corresponding phase factors is large. 

If $S[C]/\hbar$ is not stationary at C, all depends on how rapidly it changes under variation of C. 
If it changes sufficiently rapidly, the phases associated with paths near C are more or less 
equally distributed over the interval $[0, 2\pi]$, so that the corresponding phase factors add up 
to a complex number of comparatively small magnitude. In the limit $S[C]/\hbar \to \infty$, the only 
significant contributions to $I(\mathcal{T})$ come from paths in the infinitesimal neighborhood of $\mathcal{G}$. 

We have assumed that $\mathcal{G}$ lies in $\mathcal{T}$. If it does not, and if $S[C]/\hbar$ changes sufficiently rapidly, 
the phases associated with paths near any path in $\mathcal{T}$ are more or less equally distributed over 
the interval $[0, 2\pi]$, so that in the limit $S[C]/\hbar \to \infty$ there are no significant contributions 
to $I(\mathcal{T})$. 
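
A toy model shows how this works. Suppose, purely for illustration, that the paths in a bundle are labeled by a single number u and that the phase is kappa·u², which is stationary at u = 0 (the stand-in for the geodesic). The Python sketch below averages the phase factors over a bundle around u = 0 and over an equally wide bundle around u = 3, where the phase changes rapidly. The one-parameter family, the quadratic phase, and the numbers are all assumptions, not a path integral proper.

```python
import cmath

def mean_phase_factor(center, half_width, kappa, n=2001):
    """Magnitude of the average of exp(i*kappa*u^2) over n equally spaced
    values of u in [center - half_width, center + half_width]."""
    total = 0j
    for i in range(n):
        u = center - half_width + 2 * half_width * i / (n - 1)
        total += cmath.exp(1j * kappa * u * u)
    return abs(total) / n

KAPPA = 200.0   # plays the role of the scale of S/hbar (hypothetical value)
near = mean_phase_factor(0.0, 0.1, KAPPA)   # bundle containing the stationary point
far = mean_phase_factor(3.0, 0.1, KAPPA)    # bundle far from the stationary point
print(near, far)
```

In the first bundle the phase factors point in nearly the same direction and their average has a magnitude of order 1; in the second they are spread over many full turns and very nearly cancel.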

For a free particle, as you will remember, $S[C] = -mc^2\, s[C]$. From this we gather that the 
likelihood of finding a freely moving object where according to the laws of classical physics 
it should not be, decreases as its mass increases. Since for sufficiently massive objects 
the contributions to the action due to influences on their motion are small compared to 
$|-mc^2\, s[C]|$, this is equally true of objects that are not moving freely. 

What, then, are the laws of classical physics? 

They are what the laws of quantum physics degenerate into in the limit $\hbar \to 0$. In this limit, 
as you will gather from the above, the probability of finding that a particle has traveled 
within a tube (however narrow) containing a geodesic, is 1, and the probability of finding 
that a particle has traveled within a tube (however wide) not containing a geodesic, is 0. 
Thus we may state the laws of classical physics (for a single "point mass", to begin with) by 
saying that it follows a geodesic of the geometry defined by dS. 
This is readily generalized. The propagator for a system with n degrees of freedom (such 
as an m-particle system with n = 3m degrees of freedom) is 

$$\langle \mathcal{P}_f, t_f | \mathcal{P}_i, t_i \rangle = \int\!\mathcal{D}C\, e^{(i/\hbar)S[C]},$$

where $\mathcal{P}_i$ and $\mathcal{P}_f$ are the system's respective configurations at the initial time $t_i$ and the final 
time $t_f$, and the integral sums over all paths in the system's (n+1)-dimensional configuration 
spacetime leading from $(\mathcal{P}_i, t_i)$ to $(\mathcal{P}_f, t_f)$. In this case, too, the corresponding classical 
system follows a geodesic of the geometry defined by the action differential dS, which 
now depends on n spatial coordinates, one time coordinate, and the corresponding n+1 
differentials. 

The statement that a classical system follows a geodesic of the geometry defined by its 
action, is often referred to as the principle of least action. A more appropriate name is 
principle of stationary action. 


3.5.4 Energy and momentum 

Observe that if dS does not depend on t (that is, $\partial dS/\partial t = 0$) then 

$$E = -\frac{\partial dS}{\partial dt}$$

is constant along geodesics. (We'll discover the reason for the negative sign in a moment.) 
Likewise, if dS does not depend on $\mathbf{r}$ (that is, $\partial dS/\partial\mathbf{r} = 0$) then 

$$\mathbf{p} = \frac{\partial dS}{\partial d\mathbf{r}}$$

is constant along geodesics. 

E tells us how much the projection dt of a segment dC of a path C onto the time axis 
contributes to the action of C. $\mathbf{p}$ tells us how much the projection $d\mathbf{r}$ of dC onto space 
contributes to S[C]. If dS has no explicit time dependence, then equal intervals of the time 
axis make equal contributions to S[C], and if dS has no explicit space dependence, then 
equal intervals of any spatial axis make equal contributions to S[C]. In the former case, equal 
time intervals are physically equivalent: they represent equal durations. In the latter case, 
equal space intervals are physically equivalent: they represent equal distances. 

If equal intervals of the time coordinate or equal intervals of a space coordinate are not 
physically equivalent, this is so for either of two reasons. The first is that non-inertial 
coordinates are used. For if inertial coordinates are used, then every freely moving point 
mass moves by equal intervals of the space coordinates in equal intervals of the time 
coordinate, which means that equal coordinate intervals are physically equivalent. The 
second is that whatever it is that is moving is not moving freely: something, no matter what, 
influences its motion, no matter how. This is because one way of incorporating effects on 
the motion of an object into the mathematical formalism of quantum physics, is to make 
inertial coordinate intervals physically inequivalent, by letting dS depend on t and/or r. 
Thus for a freely moving classical object, both E and p are constant. Since the constancy 
of E follows from the physical equivalence of equal intervals of coordinate time (a.k.a. the 
"homogeneity" of time), and since (classically) energy is defined as the quantity whose 
constancy is implied by the homogeneity of time, E is the object's energy. 

By the same token, since the constancy of p follows from the physical equivalence of 
equal intervals of any spatial coordinate axis (a.k.a. the "homogeneity" of space), and 
since (classically) momentum is defined as the quantity whose constancy is implied by the 
homogeneity of space, p is the object's momentum. 

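Let us differentiate a former result 25, 

$$dS(t,\mathbf{r},\lambda\,dt,\lambda\,d\mathbf{r}) = \lambda\,dS(t,\mathbf{r},dt,d\mathbf{r}),$$

with respect to $\lambda$. The left-hand side becomes 

$$\frac{d(dS)}{d\lambda} = \frac{\partial dS}{\partial(\lambda dt)}\frac{\partial(\lambda dt)}{\partial\lambda} + \frac{\partial dS}{\partial(\lambda d\mathbf{r})}\cdot\frac{\partial(\lambda d\mathbf{r})}{\partial\lambda} = \frac{\partial dS}{\partial(\lambda dt)}\,dt + \frac{\partial dS}{\partial(\lambda d\mathbf{r})}\cdot d\mathbf{r},$$

while the right-hand side becomes just dS. Setting $\lambda = 1$ and using the above definitions of 
E and $\mathbf{p}$, we obtain 

$$-E\,dt + \mathbf{p}\cdot d\mathbf{r} = dS.$$

$dS = -mc^2\,ds$ is a 4-scalar. Since $(c\,dt, d\mathbf{r})$ are the components of a 4-vector, the left-hand 
side, $-E\,dt + \mathbf{p}\cdot d\mathbf{r}$, is a 4-scalar if and only if $(E/c, \mathbf{p})$ are the components of another 
4-vector. 

(If we had defined E without the minus, this 4-vector would have the components $(-E/c, \mathbf{p})$.) 

In the rest frame $\mathcal{F}'$ of a free point mass, $dt' = ds$ and $dS = -mc^2\,dt'$. Using the Lorentz 
transformations 26, we find that this equals 

$$dS = -mc^2\,\frac{dt - v\,dx/c^2}{\sqrt{1 - v^2/c^2}} = -\frac{mc^2}{\sqrt{1 - v^2/c^2}}\,dt + \frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}}\cdot d\mathbf{r},$$

where $\mathbf{v} = (v,0,0)$ is the velocity of the point mass in $\mathcal{F}$. Compare with 
$-E\,dt + \mathbf{p}\cdot d\mathbf{r} = dS$ to find that for a free point mass, 

$$E = \frac{mc^2}{\sqrt{1 - v^2/c^2}}, \qquad \mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}}.$$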
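
These expressions can be cross-checked against the definitions of E and p as derivatives of dS with respect to the differentials. The Python sketch below does this in one spatial dimension by central differences; the step size, the sample values m = c = 1 and v = 0.6, and the function names are illustrative assumptions.

```python
import math

def ds_free(dt, dx, m, c):
    """Free-particle action differential in 1+1 dimensions."""
    return -m * c**2 * math.sqrt(dt**2 - dx**2 / c**2)

def energy_momentum_numeric(v, m, c, h=1e-6):
    """E = -(derivative of dS with respect to dt) and p = derivative of dS
    with respect to dx, by central differences at (dt, dx) = (1, v).
    Because dS is homogeneous of degree 1, the overall scale of (dt, dx)
    does not affect these derivatives."""
    dt, dx = 1.0, v
    energy = -(ds_free(dt + h, dx, m, c) - ds_free(dt - h, dx, m, c)) / (2 * h)
    momentum = (ds_free(dt, dx + h, m, c) - ds_free(dt, dx - h, m, c)) / (2 * h)
    return energy, momentum

def energy_momentum_exact(v, m, c):
    """The closed-form expressions for a free point mass."""
    gamma = 1 / math.sqrt(1 - v**2 / c**2)
    return gamma * m * c**2, gamma * m * v

M, C, V = 1.0, 1.0, 0.6   # hypothetical values
e_num, p_num = energy_momentum_numeric(V, M, C)
e_exact, p_exact = energy_momentum_exact(V, M, C)
print(e_num, e_exact)
print(p_num, p_exact)
```

With these values the invariant E² − p²c² comes out as (mc²)², as it should for the components of a 4-vector whose norm is fixed by the mass.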


25 http://en.wikibooks.org/wiki/This_quantum_world/Feynman_route/From_quantum_to_classical#Action 

26 http://en.wikibooks.org/wiki/This_quantum_world/Appendix/Relativity/Lorentz#The_actual_Lorentz_transformations 
3.5.5 Lorentz force law 

To incorporate effects on the motion of a particle (regardless of their causes), we must modify 
the action differential $dS = -mc^2\,dt\,\sqrt{1 - v^2/c^2}$ that a free particle associates with a path 
segment dC. In doing so we must take care that the modified dS (i) remains homogeneous 
in the differentials 27 and (ii) remains a 4-scalar. The most straightforward way to do this is 
to add a term that is not just homogeneous but linear in the coordinate differentials: 

$$(*)\quad dS = -mc^2\,dt\,\sqrt{1 - v^2/c^2} - qV(t,\mathbf{r})\,dt + (q/c)\,\mathbf{A}(t,\mathbf{r})\cdot d\mathbf{r}.$$

Believe it or not, all classical electromagnetic effects (as against their causes) are accounted 
for by this expression. $V(t,\mathbf{r})$ is a scalar field (that is, a function of time and space coordinates 
that is invariant under rotations of the space coordinates), $\mathbf{A}(t,\mathbf{r})$ is a 3-vector field, and 
$(V, \mathbf{A})$ is a 4-vector field. We call V and $\mathbf{A}$ the scalar potential and the vector potential, 
respectively. The particle-specific constant q is the electric charge, which determines how 
strongly a particle of a given species is affected by influences of the electromagnetic kind. 

If a point mass is not free, the expressions at the end of the previous section 28 give its kinetic 
energy $E_k$ and its kinetic momentum $\mathbf{p}_k$. Casting (*) into the form 

$$dS = -(E_k + qV)\,dt + [\mathbf{p}_k + (q/c)\mathbf{A}]\cdot d\mathbf{r}$$

and plugging it into the definitions 29 

$$(**)\quad E = -\frac{\partial dS}{\partial dt}, \qquad \mathbf{p} = \frac{\partial dS}{\partial d\mathbf{r}},$$

we obtain 

$$E = E_k + qV, \qquad \mathbf{p} = \mathbf{p}_k + (q/c)\mathbf{A}.$$

qV and $(q/c)\mathbf{A}$ are the particle's potential energy and potential momentum, respectively. 
Now we plug (**) into the geodesic equation 

$$\frac{\partial dS}{\partial\mathbf{r}} = d\,\frac{\partial dS}{\partial d\mathbf{r}}.$$

For the right-hand side we obtain 

$$d\mathbf{p}_k + \frac{q}{c}\,d\mathbf{A} = d\mathbf{p}_k + \frac{q}{c}\left[dt\,\frac{\partial\mathbf{A}}{\partial t} + \left(d\mathbf{r}\cdot\frac{\partial}{\partial\mathbf{r}}\right)\mathbf{A}\right],$$


27 http://en.wikibooks.org/wiki/This_quantum_world/Feynman_route/From_quantum_to_classical#Action 

28 Chapter 3.5.4 on page 45 

29 Chapter 3.5.4 on page 45 
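while the left-hand side works out at 

$$-q\,\frac{\partial V}{\partial\mathbf{r}}\,dt + \frac{q}{c}\,\frac{\partial(\mathbf{A}\cdot d\mathbf{r})}{\partial\mathbf{r}} = -q\,\frac{\partial V}{\partial\mathbf{r}}\,dt + \frac{q}{c}\left[\left(d\mathbf{r}\cdot\frac{\partial}{\partial\mathbf{r}}\right)\mathbf{A} + d\mathbf{r}\times\left(\frac{\partial}{\partial\mathbf{r}}\times\mathbf{A}\right)\right].$$

Two terms cancel out, and the final result is 

$$d\mathbf{p}_k = q\left(-\frac{\partial V}{\partial\mathbf{r}} - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right)dt + d\mathbf{r}\times\frac{q}{c}\left(\frac{\partial}{\partial\mathbf{r}}\times\mathbf{A}\right) = q\,\mathbf{E}\,dt + d\mathbf{r}\times\frac{q}{c}\,\mathbf{B},$$

where $\mathbf{E} \equiv -\partial V/\partial\mathbf{r} - (1/c)\,\partial\mathbf{A}/\partial t$ and $\mathbf{B} \equiv (\partial/\partial\mathbf{r})\times\mathbf{A}$. 

As a classical object travels along the segment $d\mathcal{G}$ of a geodesic, its kinetic momentum 
changes by the sum of two terms, one linear in the temporal component dt of $d\mathcal{G}$ and one 
linear in the spatial component $d\mathbf{r}$. How much dt contributes to the change of $\mathbf{p}_k$ depends 
on the electric field $\mathbf{E}$, and how much $d\mathbf{r}$ contributes depends on the magnetic field $\mathbf{B}$. The 
last equation is usually written in the form 

$$\frac{d\mathbf{p}_k}{dt} = q\,\mathbf{E} + \frac{q}{c}\,\mathbf{v}\times\mathbf{B},$$

called the Lorentz force law, and accompanied by the following story: there is a physical 
entity known as the electromagnetic field, which is present everywhere, and which exerts on 
a charge q an electric force $q\mathbf{E}$ and a magnetic force $(q/c)\,\mathbf{v}\times\mathbf{B}$. 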

(Note: This form of the Lorentz force law holds in the Gaussian system of units 30 . In the 
MKSA system of units 31 the c is missing.) 
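
As a small illustration of the law in its Gaussian form, the Python sketch below evaluates the right-hand side for given fields and velocity and confirms that the magnetic term is perpendicular to the velocity. The helper names and the sample field values are hypothetical; c is set to 1 by default for convenience.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(q, e_field, b_field, v, c=1.0):
    """Rate of change of kinetic momentum, q E + (q/c) v x B (Gaussian units)."""
    v_cross_b = cross(v, b_field)
    return tuple(q * e + (q / c) * vb for e, vb in zip(e_field, v_cross_b))

# Hypothetical example: no electric field, magnetic field along z, motion along x
velocity = (0.3, 0.0, 0.0)
force = lorentz_force(q=1.0, e_field=(0.0, 0.0, 0.0),
                      b_field=(0.0, 0.0, 2.0), v=velocity)
print(force, dot(force, velocity))
```

Because the magnetic term is always at right angles to the velocity, it changes the direction of the kinetic momentum but does no work on the particle.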


3.5.6 Whence the classical story? 

Imagine a small rectangle in spacetime with corners 


A = (0,0, 0,0 ),B = (dt, 0,0,0), (7 = (0,ds,0,0),D = (dt,dx, 0,0). 

Let's calculate the electromagnetic contribution to the action of the path from A to D via B 
for a unit charge (q = 1) in natural units ( c = 1 ): 

Sabd = — V(dt/2, 0,0,0) dt + A x (dt,dx/2, 0,0) dx 


= — V(dt/2, 0,0,0) dt + 


A x (0,dx/2,0,0) + 



dx. 


Next, the contribution to the action of the path from A to D via C: 
Sacd = A x (0,dx/2,0,0) dx — V(dt./2,dx,0,0) dt 


30 http://en.wikipedia.org/wiki/Centimeter%20gram%20second%20system%20of%20units

31 http://en.wikipedia.org/wiki/International%20System%20of%20Units
$$\Delta S = S_{ACD}-S_{ABD} = \left(-\frac{\partial V}{\partial x}-\frac{\partial A_x}{\partial t}\right)dt\,dx = E_x\,dt\,dx.$$


Alternatively, you may think of $\Delta S$ as the electromagnetic contribution to the action of the
loop A → B → D → C → A.



Figure 37 


Let's repeat the calculation for a small rectangle with corners 

$$A = (0,0,0,0),\quad B = (0,0,0,dz),\quad C = (0,0,dy,0),\quad D = (0,0,dy,dz).$$

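$$S_{ABD} = A_z(0,0,0,dz/2)\,dz + A_y(0,0,dy/2,dz)\,dy = A_z(0,0,0,dz/2)\,dz + \left[A_y(0,0,dy/2,0)+\frac{\partial A_y}{\partial z}\,dz\right]dy,$$

$$S_{ACD} = A_y(0,0,dy/2,0)\,dy + A_z(0,0,dy,dz/2)\,dz = A_y(0,0,dy/2,0)\,dy + \left[A_z(0,0,0,dz/2)+\frac{\partial A_z}{\partial y}\,dy\right]dz,$$

$$\Delta S = S_{ACD}-S_{ABD} = \left(\frac{\partial A_z}{\partial y}-\frac{\partial A_y}{\partial z}\right)dy\,dz = B_x\,dy\,dz.$$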

Thus the electromagnetic contribution to the action of this loop equals the flux of B through 
the loop. 
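
The statement that the action difference around a small rectangle equals the magnetic flux through it can be checked numerically. The sketch below assumes a uniform field $B_x$ described by the (hypothetical, illustrative) potential $A_y = -B_x z/2$, $A_z = B_x y/2$, sets $q = c = 1$, and evaluates the potential at the midpoint of each leg, as in the calculation above.

```python
def loop_action(A_y, A_z, dy, dz):
    """Action difference between the two routes across a small y-z rectangle.

    Route 1 runs along z first (at y = 0), then along y (at z = dz).
    Route 2 runs along y first (at z = 0), then along z (at y = dy).
    Each leg contributes (potential at its midpoint) * (length), with q = c = 1.
    """
    s_z_then_y = A_z(0.0, dz / 2) * dz + A_y(dy / 2, dz) * dy
    s_y_then_z = A_y(dy / 2, 0.0) * dy + A_z(dy, dz / 2) * dz
    return s_y_then_z - s_z_then_y


B_X = 0.7  # illustrative field strength (arbitrary units)


def A_y(y, z):
    return -0.5 * B_X * z


def A_z(y, z):
    return 0.5 * B_X * y


dy, dz = 1e-3, 2e-3
delta_s = loop_action(A_y, A_z, dy, dz)
flux = B_X * dy * dz
print(delta_s, flux)
```

For this choice of potential, $\partial A_z/\partial y - \partial A_y/\partial z = B_x$, so the two printed numbers should agree up to rounding.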
Remembering (i) Stokes' theorem 32 and (ii) the definition 33 of $\mathbf B$ in terms of $\mathbf A$, we find that

$$\oint_{\partial\Sigma}\mathbf A\cdot d\mathbf r = \int_\Sigma \mathrm{curl}\,\mathbf A\cdot d\mathbf\Sigma = \int_\Sigma \mathbf B\cdot d\mathbf\Sigma.$$

In (other) words, the magnetic flux through a loop $\partial\Sigma$ (or through any surface $\Sigma$ bounded
by $\partial\Sigma$) equals the circulation of $\mathbf A$ around the loop (or around any surface bounded by the
loop).


The effect of a circulation $\oint\mathbf A\cdot d\mathbf r$ around the finite rectangle A → B → D → C → A is
to increase (or decrease) the action associated with the segment A → B → D relative to
the action associated with the segment A → C → D. If the actions of the two segments are
equal, then we can expect the path of least action from A to D to be a straight line. If one
segment has a greater action than the other, then we can expect the path of least action
from A to D to curve away from the segment with the larger action.



Figure 38 


Compare this with the classical story, which explains the curvature of the path of a charged 
particle in a magnetic field by invoking a force that acts at right angles to both the magnetic 
field and the particle's direction of motion 34 . The quantum-mechanical treatment of the 
same effect offers no such explanation. Quantum mechanics invokes no mechanism of any 
kind. It simply tells us that for a sufficiently massive charge traveling from A to D, the 
probability of finding that it has done so within any bundle of paths not containing the 
action-geodesic connecting A with D is virtually 0.

Much the same goes for the classical story according to which the curvature of the path of 
a charged particle in a spacetime plane is due to a force that acts in the direction of the 
electric field. (Observe that curvature in a spacetime plane is equivalent to acceleration or 
deceleration. In particular, curvature in a spacetime plane containing the x axis is equivalent 


32 Chapter 7.2.3 on page 115 

33 Chapter 3.5.5 on page 47 

34 Chapter 3.5.5 on page 47 
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to acceleration in a direction parallel to the x axis.) In this case the corresponding circulation 
is that of the 4-vector potential (cV, A) around a spacetime loop. 



3.6 Schrödinger at last


The Schrödinger equation is non-relativistic. We obtain the non-relativistic version of the
electromagnetic action differential,

$$dS = -mc^2\,dt\,\sqrt{1-v^2/c^2} - qV(t,\mathbf r)\,dt + (q/c)\,\mathbf A(t,\mathbf r)\cdot d\mathbf r,$$

by expanding the root and ignoring all but the first two terms:

$$\sqrt{1-v^2/c^2} = 1-\frac12\frac{v^2}{c^2}-\frac18\frac{v^4}{c^4}-\cdots \approx 1-\frac12\frac{v^2}{c^2}.$$

This is obviously justified if $v\ll c$, which defines the non-relativistic regime.
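
To get a feeling for how good the two-term approximation is, one can compare it with the exact root for a few values of $\beta = v/c$. The sketch below is only a numerical illustration; the sample speeds are arbitrary choices.

```python
import math


def exact_root(beta):
    """sqrt(1 - v^2/c^2) with beta = v/c."""
    return math.sqrt(1.0 - beta ** 2)


def two_term(beta):
    """First two terms of the expansion: 1 - (1/2) v^2/c^2."""
    return 1.0 - 0.5 * beta ** 2


def truncation_error(beta):
    """Difference between the two-term approximation and the exact root."""
    return two_term(beta) - exact_root(beta)


for beta in (0.3, 0.1, 0.01):
    # The leading neglected term is (1/8) beta^4.
    print(beta, truncation_error(beta), beta ** 4 / 8.0)
```

The error tracks the first neglected term, $\frac18 v^4/c^4$, and shrinks rapidly as $v/c$ decreases.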

Writing the potential part of $dS$ as $q\,[-V + \mathbf A(t,\mathbf r)\cdot(\mathbf v/c)]\,dt$ makes it clear that in most
non-relativistic situations the effects represented by the vector potential $\mathbf A$ are small compared to
those represented by the scalar potential $V$. If we ignore them (or assume that $\mathbf A$ vanishes),
and if we include the charge $q$ in the definition of $V$ (or assume that $q = 1$), we obtain

$$S[\mathcal C] = -mc^2\,(t_B-t_A) + \int_{\mathcal C} dt\left[\frac m2\,v^2 - V(t,\mathbf r)\right]$$

for the action associated with a spacetime path $\mathcal C$.

Because the first term is the same for all paths from A to B, it has no effect on the differences 
between the phases of the amplitudes associated with different paths. By dropping it we 
change neither the classical phenomena (inasmuch as the extremal path remains the same) nor 
the quantum phenomena (inasmuch as interference effects only depend on those differences). 
Thus

$$\langle B|A\rangle = \int\mathcal{DC}\;e^{(i/\hbar)\int_{\mathcal C}dt\,[(m/2)v^2 - V]}.$$


We now introduce the so-called wave function $\psi(t,\mathbf r)$ as the amplitude of finding our particle
at $\mathbf r$ if the appropriate measurement is made at time $t$. $\langle t,\mathbf r|t',\mathbf r'\rangle\,\psi(t',\mathbf r')$, accordingly, is the
amplitude of finding the particle first at $\mathbf r'$ (at time $t'$) and then at $\mathbf r$ (at time $t$). Integrating
over $\mathbf r'$, we obtain the amplitude of finding the particle at $\mathbf r$ (at time $t$), provided that Rule B
applies. The wave function thus satisfies the equation

$$\psi(t,\mathbf r) = \int d^3r'\,\langle t,\mathbf r|t',\mathbf r'\rangle\,\psi(t',\mathbf r').$$


We again simplify our task by pretending that space is one-dimensional. We further assume
that $t$ and $t'$ differ by an infinitesimal interval $\epsilon$. Since $\epsilon$ is infinitesimal, there is only one
path leading from $x'$ to $x$. We can therefore forget about the path integral except for a
normalization factor $\mathcal A$ implicit in the integration measure $\mathcal{DC}$, and make the following
substitutions:

$$dt = \epsilon,\qquad v = \frac{x-x'}{\epsilon},\qquad V = V\!\left(t+\frac\epsilon2,\ \frac{x+x'}2\right).$$

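This gives us

$$\psi(t+\epsilon,x) = \mathcal A\int dx'\;e^{im(x-x')^2/2\hbar\epsilon}\;e^{-(i\epsilon/\hbar)\,V(t+\epsilon/2,\,(x+x')/2)}\;\psi(t,x').$$

We obtain a further simplification if we introduce $\eta = x'-x$ and integrate over $\eta$ instead
of $x'$. (The integration "boundaries" $-\infty$ and $+\infty$ are the same for both $x'$ and $\eta$.) We now
have that

$$\psi(t+\epsilon,x) = \mathcal A\int d\eta\;e^{im\eta^2/2\hbar\epsilon}\;e^{-(i\epsilon/\hbar)\,V(t+\epsilon/2,\,x+\eta/2)}\;\psi(t,x+\eta).$$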

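Since we are interested in the limit $\epsilon\to0$, we expand all terms to first order in $\epsilon$. To which
power in $\eta$ should we expand? As $\eta$ increases, the phase $m\eta^2/2\hbar\epsilon$ increases at an infinite
rate (in the limit $\epsilon\to0$) unless $\eta^2$ is of the same order as $\epsilon$. In this limit, higher-order
contributions to the integral cancel out. Thus the left-hand side expands to

$$\psi(t+\epsilon,x) \approx \psi(t,x) + \frac{\partial\psi}{\partial t}\,\epsilon,$$

while $e^{-(i\epsilon/\hbar)\,V(t+\epsilon/2,\,x+\eta/2)}\,\psi(t,x+\eta)$ expands to

$$\left[1-\frac{i\epsilon}{\hbar}V(t,x)\right]\psi(t,x) + \frac{\partial\psi}{\partial x}\,\eta + \frac12\frac{\partial^2\psi}{\partial x^2}\,\eta^2.$$

The following integrals need to be evaluated:

$$I_1 = \int d\eta\;e^{im\eta^2/2\hbar\epsilon},\qquad I_2 = \int d\eta\;e^{im\eta^2/2\hbar\epsilon}\,\eta,\qquad I_3 = \int d\eta\;e^{im\eta^2/2\hbar\epsilon}\,\eta^2.$$

The results are

$$I_1 = \sqrt{2\pi i\hbar\epsilon/m},\qquad I_2 = 0,\qquad I_3 = \sqrt{2\pi\hbar^3\epsilon^3/im^3}.$$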

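Putting Humpty Dumpty back together again yields

$$\psi(t,x) + \frac{\partial\psi}{\partial t}\,\epsilon = \mathcal A\sqrt{\frac{2\pi i\hbar\epsilon}{m}}\left(1-\frac{i\epsilon}{\hbar}V(t,x)\right)\psi(t,x) + \frac{\mathcal A}{2}\sqrt{\frac{2\pi\hbar^3\epsilon^3}{im^3}}\;\frac{\partial^2\psi}{\partial x^2}.$$

The factor of $\psi(t,x)$ must be the same on both sides, so $\mathcal A = \sqrt{m/2\pi i\hbar\epsilon}$, which reduces
Humpty Dumpty to

$$\frac{\partial\psi}{\partial t}\,\epsilon = -\frac{i\epsilon}{\hbar}\,V\psi + \frac{i\hbar\epsilon}{2m}\,\frac{\partial^2\psi}{\partial x^2}.$$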

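Multiplying by $i\hbar/\epsilon$ and taking the limit $\epsilon\to0$ (which is trivial since $\epsilon$ has dropped out),
we arrive at the Schrödinger equation for a particle with one degree of freedom subject to a
potential $V(t,x)$:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + V\psi.$$

Trumpets please! The transition to three dimensions is straightforward:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}\right) + V\psi.$$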


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4 The Schrodinger equation: implications 
and applications 


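In this chapter we take a look at some of the implications of the Schrödinger equation

$$i\hbar\frac{\partial\psi}{\partial t} = \frac1{2m}\left(\frac\hbar i\frac{\partial}{\partial\mathbf r}-\mathbf A\right)^2\psi + V\psi.$$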

4.1 How fuzzy positions get fuzzier 

We will calculate the rate at which the fuzziness of a position probability distribution 
increases, in consequence of the fuzziness of the corresponding momentum, when there is 
no counterbalancing attraction (like that between the nucleus and the electron in atomic 
hydrogen) . 

Because it is easy to handle, we choose a Gaussian function

$$\psi(0,x) = N e^{-x^2/2\sigma^2},$$

which has a bell-shaped graph. It defines a position probability distribution

$$|\psi(0,x)|^2 = N^2 e^{-x^2/\sigma^2}.$$

If we normalize this distribution so that $\int dx\,|\psi(0,x)|^2 = 1$, then $N^2 = 1/\sigma\sqrt\pi$, and

$$|\psi(0,x)|^2 = e^{-x^2/\sigma^2}/\sigma\sqrt\pi.$$

We also have that

• $\Delta x(0) = \sigma/\sqrt2$,

• the Fourier transform of $\psi(0,x)$ is $\overline\psi(0,k) = \sqrt{\sigma/\sqrt\pi}\;e^{-\sigma^2k^2/2}$,

• this defines the momentum probability distribution $|\overline\psi(0,k)|^2 = \sigma e^{-\sigma^2k^2}/\sqrt\pi$,

• and $\Delta k(0) = 1/\sigma\sqrt2$.

The fuzziness of the position and of the momentum of a particle associated with $\psi(0,x)$ is
therefore the minimum allowed by the "uncertainty" relation 1 : $\Delta x(0)\,\Delta k(0) = 1/2$.

Now recall that 


1 Chapter 2.7 on page 27 
$$\overline\psi(t,k) = \overline\psi(0,k)\,e^{-i\omega t},$$

where $\omega = \hbar k^2/2m$. This has the Fourier transform

$$\psi(t,x) = \sqrt{\frac{\sigma}{\sqrt\pi}}\;\frac1{\sqrt{\sigma^2+i(\hbar/m)\,t}}\;e^{-x^2/2[\sigma^2+i(\hbar/m)t]},$$

and this defines the position probability distribution

$$|\psi(t,x)|^2 = \frac1{\sqrt\pi\,\sqrt{\sigma^2+(\hbar^2/m^2\sigma^2)\,t^2}}\;e^{-x^2/[\sigma^2+(\hbar^2/m^2\sigma^2)\,t^2]}.$$

Comparison with $|\psi(0,x)|^2$ reveals that $\sigma(t) = \sqrt{\sigma^2+(\hbar^2/m^2\sigma^2)\,t^2}$. Therefore,

$$\Delta x(t) = \frac{\sigma(t)}{\sqrt2} = \sqrt{\frac{\sigma^2}2+\frac{\hbar^2t^2}{2m^2\sigma^2}} = \sqrt{[\Delta x(0)]^2+\frac{\hbar^2t^2}{4m^2[\Delta x(0)]^2}}.$$
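
As a sanity check, one can verify numerically that the spreading Gaussian $\psi(t,x) \propto [\sigma^2+i(\hbar/m)t]^{-1/2}\,e^{-x^2/2[\sigma^2+i(\hbar/m)t]}$ satisfies the free Schrödinger equation $i\hbar\,\partial\psi/\partial t = -(\hbar^2/2m)\,\partial^2\psi/\partial x^2$. The sketch below uses central finite differences and the assumed units $\hbar = m = \sigma = 1$; the step size and sample point are arbitrary.

```python
import cmath
import math

HBAR = 1.0   # assumed units: hbar = m = sigma = 1
MASS = 1.0
SIGMA = 1.0


def psi(t, x):
    """Freely spreading Gaussian wave packet."""
    s2 = SIGMA ** 2 + 1j * HBAR * t / MASS
    norm = math.sqrt(SIGMA / math.sqrt(math.pi))
    return norm / cmath.sqrt(s2) * cmath.exp(-x ** 2 / (2.0 * s2))


def schrodinger_residual(t, x, h=1e-3):
    """i*hbar*dpsi/dt + (hbar^2/2m)*d2psi/dx2, by central differences."""
    dpsi_dt = (psi(t + h, x) - psi(t - h, x)) / (2.0 * h)
    d2psi_dx2 = (psi(t, x + h) - 2.0 * psi(t, x) + psi(t, x - h)) / h ** 2
    return 1j * HBAR * dpsi_dt + HBAR ** 2 / (2.0 * MASS) * d2psi_dx2


def sigma_t(t):
    """Width parameter sigma(t) read off from |psi(t,x)|^2."""
    return math.sqrt(SIGMA ** 2 + (HBAR * t / (MASS * SIGMA)) ** 2)


residual = schrodinger_residual(0.5, 0.3)
print(abs(residual), sigma_t(0.5))
```

The residual should vanish up to discretization error, and the peak value of $|\psi(t,x)|^2$ at $x = 0$ should equal $1/\sqrt\pi\,\sigma(t)$.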


The graphs below illustrate how rapidly the fuzziness of a particle with the mass of an electron
grows, when compared to an object with the mass of a C60 molecule or a peanut. Here we see
one reason, though by no means the only one, why for all intents and purposes "once sharp,
always sharp" is true of the positions of macroscopic objects.
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Figure 40 


Next, a C60 molecule with $\Delta x(0) = 1$ nanometer. In a second, $\Delta x(t)$ grows to 4.4 centimeters.
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Figure 42 
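
The figure of 4.4 centimeters quoted for the C60 molecule can be reproduced from the formula for $\Delta x(t)$. The sketch below assumes SI values for $\hbar$, the electron mass and the atomic mass unit, and takes the mass of C60 to be 720 atomic mass units (60 carbon-12 atoms); these constants are assumptions added here, not values given in the text.

```python
import math

HBAR = 1.054571817e-34            # J s (assumed CODATA value)
ATOMIC_MASS_UNIT = 1.66053906660e-27   # kg (assumed CODATA value)
M_ELECTRON = 9.1093837015e-31     # kg (assumed CODATA value)
M_C60 = 720.0 * ATOMIC_MASS_UNIT  # assumption: 60 carbon-12 atoms


def delta_x(t, dx0, mass):
    """Position fuzziness after time t, starting from dx0 (SI units)."""
    return math.sqrt(dx0 ** 2 + (HBAR * t) ** 2 / (4.0 * mass ** 2 * dx0 ** 2))


dx_c60 = delta_x(t=1.0, dx0=1e-9, mass=M_C60)
dx_electron = delta_x(t=1.0, dx0=1e-9, mass=M_ELECTRON)
print(f"C60 after 1 s:      {dx_c60:.3e} m")
print(f"electron after 1 s: {dx_electron:.3e} m")
```

Because the growth term scales as $1/m$, the same initial fuzziness spreads far faster for the electron than for the molecule.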


4.2 Time-independent Schrödinger equation

If the potential $V$ does not depend on time, then the Schrödinger equation has solutions
that are products of a time-independent function $\psi(\mathbf r)$ and a time-dependent phase factor
$e^{-(i/\hbar)Et}$:

$$\psi(t,\mathbf r) = \psi(\mathbf r)\,e^{-(i/\hbar)Et}.$$

Because the probability density $|\psi(t,\mathbf r)|^2$ is independent of time, these solutions are called
stationary.

Plug $\psi(\mathbf r)\,e^{-(i/\hbar)Et}$ into

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\,\frac{\partial}{\partial\mathbf r}\cdot\frac{\partial}{\partial\mathbf r}\,\psi + V\psi$$


to find that $\psi(\mathbf r)$ satisfies the time-independent Schrödinger equation

$$E\,\psi(\mathbf r) = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}\right)\psi(\mathbf r) + V(\mathbf r)\,\psi(\mathbf r).$$


4.3 Why energy is quantized 

Limiting ourselves again to one spatial dimension, we write the time-independent Schrödinger
equation in this form:

$$\frac{d^2\psi(x)}{dx^2} = A(x)\,\psi(x),\qquad A(x) = \frac{2m}{\hbar^2}\Big[V(x)-E\Big].$$

Since this equation contains no complex numbers except possibly $\psi$ itself, it has real solutions,
and these are the ones in which we are interested. You will notice that if $V > E$, then $A$ is
positive and $\psi(x)$ has the same sign as its second derivative. This means that the graph of
$\psi(x)$ curves upward above the $x$ axis and downward below it. Thus it cannot cross the axis.
On the other hand, if $V < E$, then $A$ is negative and $\psi(x)$ and its second derivative have
opposite signs. In this case the graph of $\psi(x)$ curves downward above the $x$ axis and upward
below it. As a result, the graph of $\psi(x)$ keeps crossing the axis: it is a wave. Moreover,
the larger the difference $E - V$, the larger the curvature of the graph; and the larger the
curvature, the smaller the wavelength. In particle terms, the higher the kinetic energy, the
higher the momentum.

Let us now find the solutions that describe a particle "trapped" in a potential well — a 
bound state. Consider this potential: 


Observe, to begin with, that at $x_1$ and $x_2$, where $E = V$, the slope of $\psi(x)$ does not change
since $d^2\psi(x)/dx^2 = 0$ at these points. This tells us that the probability of finding the particle
cannot suddenly drop to zero at these points. It will therefore be possible to find the particle
to the left of $x_1$ or to the right of $x_2$, where classically it could not be. (A classical particle
would oscillate back and forth between these points.)

Next, take into account that the probability distributions defined by $\psi(x)$ must be normalizable.
For the graph of $\psi(x)$ this means that it must approach the $x$ axis asymptotically as
$x\to\pm\infty$.

Suppose that we have a normalized solution for a particular value $E$. If we increase or
decrease the value of $E$, the curvature of the graph of $\psi(x)$ between $x_1$ and $x_2$ increases or
decreases. A small increase or decrease won't give us another solution: $\psi(x)$ won't vanish
asymptotically for both positive and negative $x$. To obtain another solution, we must increase
$E$ by just the right amount to increase or decrease by one the number of wave nodes between
the "classical" turning points $x_1$ and $x_2$ and to make $\psi(x)$ again vanish asymptotically in
both directions.

The bottom line is that the energy of a bound particle (a particle "trapped" in a potential
well) is quantized: only certain values $E_k$ yield solutions $\psi_k(x)$ of the time-independent
Schrödinger equation:
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4.4 A quantum bouncing ball 

As a specific example, consider the following potential: 

$$V(z) = mgz \ \text{ if } z > 0 \quad\text{and}\quad V(z) = \infty \ \text{ if } z < 0.$$

$g$ is the gravitational acceleration at the floor. For $z < 0$, the Schrödinger equation as given
in the previous section 5 tells us that $d^2\psi(z)/dz^2 = \infty$ unless $\psi(z) = 0$. The only sensible
solution for negative $z$ is therefore $\psi(z) = 0$. The requirement that $V(z) = \infty$ for $z < 0$
ensures that our perfectly elastic, frictionless quantum bouncer won't be found below the
floor.

Since a picture is worth more than a thousand words, we won't solve the time-independent 
Schrodinger equation for this particular potential but merely plot its first eight solutions: 



5 Chapter 4.2 on page 61 
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Figure 45 


Where would a classical bouncing ball subject to the same potential reverse its direction of 
motion? Observe the correlation between position and momentum (wavenumber). 
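
The argument of the previous section, that only certain energies let $\psi$ vanish asymptotically, can be tried out numerically on this potential. The sketch below is a simple "shooting" calculation and not the method used to produce the plots: it assumes dimensionless units in which lengths are measured in $(\hbar^2/2m^2g)^{1/3}$ and energies in $mg$ times that length, so that the equation reads $\psi'' = (z-E)\,\psi$ with $\psi(0) = 0$. All function names, the integration range and the step sizes are illustrative choices.

```python
def psi_far(energy, z_max=12.0, h=0.004):
    """Integrate psi'' = (z - E) psi from z = 0 with psi(0) = 0, psi'(0) = 1.

    Uses the classical fourth-order Runge-Kutta scheme and returns psi(z_max).
    For a non-allowed energy the solution blows up, with a sign that flips
    each time the energy passes an allowed value.
    """
    steps = int(round(z_max / h))
    z, psi, dpsi = 0.0, 0.0, 1.0
    for _ in range(steps):
        k1p, k1d = dpsi, (z - energy) * psi
        k2p = dpsi + 0.5 * h * k1d
        k2d = (z + 0.5 * h - energy) * (psi + 0.5 * h * k1p)
        k3p = dpsi + 0.5 * h * k2d
        k3d = (z + 0.5 * h - energy) * (psi + 0.5 * h * k2p)
        k4p = dpsi + h * k3d
        k4d = (z + h - energy) * (psi + h * k3p)
        psi += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        dpsi += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        z += h
    return psi


def bound_levels(e_min=0.5, e_max=6.0, step=0.5, tol=1e-6):
    """Scan for sign changes of psi_far and refine each one by bisection."""
    levels = []
    lo, f_lo = e_min, psi_far(e_min)
    while lo < e_max:
        hi = lo + step
        f_hi = psi_far(hi)
        if f_lo * f_hi < 0.0:
            a, f_a, b = lo, f_lo, hi
            while b - a > tol:
                mid = 0.5 * (a + b)
                f_mid = psi_far(mid)
                if f_a * f_mid <= 0.0:
                    b = mid
                else:
                    a, f_a = mid, f_mid
            levels.append(0.5 * (a + b))
        lo, f_lo = hi, f_hi
    return levels


levels = bound_levels()
print(levels)
```

In these units the allowed energies are the negatives of the zeros of the Airy function Ai, so the first few values found should be close to 2.338, 4.088 and 5.521.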

All of these states are stationary; the probability of finding the quantum bouncer in any 
particular interval of the z axis is independent of time. So how do we get it to move? 

Recall that any linear combination of solutions of the Schrodinger equation is another 
solution. Consider this linear combination of two stationary states: 


$$\psi(t,x) = A\,\psi_1(x)\,e^{-i\omega_1t} + B\,\psi_2(x)\,e^{-i\omega_2t}.$$

Assuming that the coefficients $A, B$ and the wave functions $\psi_1(x), \psi_2(x)$ are real, we calculate
the mean position of a particle associated with $\psi(t,x)$:

$$\int dx\,\psi^*x\,\psi = \int dx\,\big(A\psi_1e^{i\omega_1t}+B\psi_2e^{i\omega_2t}\big)\,x\,\big(A\psi_1e^{-i\omega_1t}+B\psi_2e^{-i\omega_2t}\big)$$

$$= A^2\int dx\,\psi_1^2\,x + B^2\int dx\,\psi_2^2\,x + AB\,\big(e^{i(\omega_1-\omega_2)t}+e^{i(\omega_2-\omega_1)t}\big)\int dx\,\psi_1\,x\,\psi_2.$$

The first two integrals are the (time-independent) mean positions of a particle associated
with $\psi_1(x)\,e^{-i\omega_1t}$ and $\psi_2(x)\,e^{-i\omega_2t}$, respectively. The last term equals

$$2AB\cos(\Delta\omega\,t)\int dx\,\psi_1\,x\,\psi_2,$$

and this tells us that the particle's mean position oscillates with frequency $\Delta\omega = \omega_2-\omega_1$
and amplitude $2AB\int dx\,\psi_1\,x\,\psi_2$ about the sum of the first two terms.

Visit this site 6 to watch the time-dependence of the probability distribution associated with
a quantum bouncer that is initially associated with a Gaussian distribution.

4.5 Atomic hydrogen 

While de Broglie's theory of 1923 featured circular electron waves, Schrödinger's "wave
mechanics" of 1926 features standing waves in three dimensions. Finding them means finding
the solutions of the time-independent Schrödinger equation

$$E\,\psi(\mathbf r) = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}\right)\psi(\mathbf r) + V(\mathbf r)\,\psi(\mathbf r)$$

with $V(\mathbf r) = -e^2/r$, the potential energy of a classical electron at a distance $r = |\mathbf r|$ from
the proton. (Only when we come to the relativistic theory will we be able to shed the last
vestige of classical thinking.)


In using this equation, we ignore (i) the influence of the electron on the proton, whose
mass is some 1836 times larger than that of the electron, and (ii) the electron's spin. Since
relativistic and spin effects on the measurable properties of atomic hydrogen are rather
small, this non-relativistic approximation nevertheless gives excellent results.

For bound states the total energy $E$ is negative, and the Schrödinger equation has a discrete
set of solutions. As it turns out, the "allowed" values of $E$ are precisely the values that Bohr
obtained in 1913:

$$E_n = -\frac1{n^2}\,\frac{\mu e^4}{2\hbar^2},\qquad n = 1,2,3,\dots$$

However, for each $n$ there are now $n^2$ linearly independent solutions. (If $\psi_1,\dots,\psi_k$ are
independent solutions, then none of them can be written as a linear combination of
the others.)
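
To attach numbers to the energy formula, the sketch below evaluates $E_n$ in electron volts. It assumes that $\mu$ denotes the electron-proton reduced mass (the text does not define $\mu$ at this point), converts the Gaussian $e^2$ to SI via $e^2/4\pi\varepsilon_0$, and uses CODATA-style constants that are not given in the text.

```python
import math

E_CHARGE = 1.602176634e-19     # C (assumed CODATA value)
HBAR = 1.054571817e-34         # J s (assumed CODATA value)
EPS0 = 8.8541878128e-12        # F/m (assumed CODATA value)
M_ELECTRON = 9.1093837015e-31  # kg (assumed CODATA value)
M_PROTON = 1.67262192369e-27   # kg (assumed CODATA value)


def bohr_energy_ev(n):
    """E_n = -(1/n^2) * mu * e^4 / (2 hbar^2), returned in electron volts."""
    mu = M_ELECTRON * M_PROTON / (M_ELECTRON + M_PROTON)  # reduced mass (assumption)
    e2 = E_CHARGE ** 2 / (4.0 * math.pi * EPS0)           # Gaussian e^2 expressed in J m
    joules = -mu * e2 ** 2 / (2.0 * HBAR ** 2 * n ** 2)
    return joules / E_CHARGE


for n in (1, 2, 3):
    print(n, bohr_energy_ev(n))
```

The ground-state value comes out near -13.6 eV, and the levels scale as $1/n^2$.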

Solutions with different n correspond to different energies. What physical differences 
correspond to linearly independent solutions with the same n? 


6 http://www.uark.edu/misc/julio/bouncing_ball/bouncing_ball.html
Using polar coordinates, one finds that all solutions for a particular value $E_n$ are linear
combinations of solutions that have the form

$$\psi(r,\phi,\theta) = e^{(i/\hbar)\,l_z\phi}\,\psi(r,\theta).$$

$l_z$ turns out to be another quantized variable, for $e^{(i/\hbar)\,l_z\phi} = e^{(i/\hbar)\,l_z(\phi+2\pi)}$ implies that $l_z = m\hbar$
with $m = 0, \pm1, \pm2, \dots$ In addition, $|m|$ has an upper bound, as we shall see in a moment.

Just as the factorization of $\psi(t,\mathbf r)$ into $e^{-(i/\hbar)Et}\,\psi(\mathbf r)$ made it possible to obtain a
$t$-independent Schrödinger equation, so the factorization of $\psi(r,\phi,\theta)$ into $e^{(i/\hbar)\,l_z\phi}\,\psi(r,\theta)$
makes it possible to obtain a $\phi$-independent Schrödinger equation. This contains another
real parameter $\Lambda$, over and above $m$, whose "allowed" values are given by $l(l+1)\hbar^2$, with
$l$ an integer satisfying $0 \le l \le n-1$. The range of possible values for $m$ is bounded by the
inequality $|m| \le l$. The possible values of the principal quantum number $n$, the angular
momentum quantum number $l$, and the so-called magnetic quantum number $m$ thus are:


n = 1    l = 0    m = 0

n = 2    l = 0    m = 0
         l = 1    m = 0, ±1

n = 3    l = 0    m = 0
         l = 1    m = 0, ±1
         l = 2    m = 0, ±1, ±2

n = 4    ...
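
The rules $0 \le l \le n-1$ and $|m| \le l$ are easy to enumerate mechanically. The following sketch (function name chosen here for illustration) lists the allowed triples and confirms that each $n$ has $n^2$ of them.

```python
def hydrogen_states(n):
    """All (n, l, m) with 0 <= l <= n - 1 and -l <= m <= l."""
    return [(n, l, m) for l in range(n) for m in range(-l, l + 1)]


for n in range(1, 5):
    states = hydrogen_states(n)
    # The count of linearly independent solutions for a given n is n squared.
    print(n, len(states), states)
```

For $n = 4$ this continues the table with $l = 0, 1, 2, 3$ and sixteen states in all.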


Each possible set of quantum numbers $n, l, m$ defines a unique wave function $\psi_{nlm}(t,\mathbf r)$, and
together these make up a complete set of bound-state solutions ($E < 0$) of the Schrödinger
equation with $V(\mathbf r) = -e^2/r$. The following images give an idea of the position probability
distributions of the first three $l = 0$ states (not to scale). Below them are the probability
densities plotted against $r$. Observe that these states have $n-1$ nodes, all of which are
spherical, that is, surfaces of constant $r$. (The nodes of a wave in three dimensions are
two-dimensional surfaces. The nodes of a "probability wave" are the surfaces at which the
sign of $\psi$ changes and, consequently, the probability density $|\psi|^2$ vanishes.)







Figure 47 


Take another look at these images: 
Figure 48 Image: Orbital s2.png

Figure 49 Image: Orbital s4.png

Figure 50 ρ 2p0

Figure 51 ρ 3p0

Figure 52 ρ 3d0

Figure 53 ρ 4p0

Figure 54 Image: Orbital s6.png

Figure 55 Image: Orbital s8.png

Figure 56 ρ 4d0

Figure 57 ρ 4f0

Figure 58 ρ 5d0

Figure 59 ρ 5f0


The letters s, p, d, f stand for $l = 0, 1, 2, 3$, respectively. (Before the quantum-mechanical origin
of atomic spectral lines was understood, a distinction was made between "sharp," "principal,"
"diffuse," and "fundamental" lines. These terms were subsequently found to correspond to
the first four values that $l$ can take. From $l = 3$ onward the labels follow the alphabet:
f, g, h...) Observe that these states display both spherical and conical nodes, the latter being
surfaces of constant $\theta$. (The "conical" node with $\theta = \pi/2$ is a horizontal plane.) These states,
too, have a total of $n-1$ nodes, $l$ of which are conical.

Because the "waviness" in $\phi$ is contained in a phase factor $e^{im\phi}$, it does not show up in
representations of $|\psi|^2$. To make it visible, it is customary to replace $e^{im\phi}$ by its real part
$\cos(m\phi)$, as in the following images, which do not represent probability distributions.
Figure 60 ρ 4f1

Figure 61

Figure 63 ρ 5f2

Figure 64 ρ 5f3

Figure 65 ρ 5g1

Figure 67 ρ 5g3


The total number of nodes is again $n-1$, the total number of non-spherical nodes is again $l$,
but now there are $m$ plane nodes containing the $z$ axis and $l-m$ conical nodes.

What is so special about the $z$ axis? Absolutely nothing, for the wave functions $\psi'_{nlm}$, which
are defined with respect to a different axis, make up another complete set of bound-state
solutions. This means that every wave function $\psi'_{nlm}$ can be written as a linear combination
of the functions $\psi_{nlm}$, and vice versa.


4.6 Observables and operators 


Remember the mean values 


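$$\langle x\rangle = \int|\psi|^2\,x\,dx \quad\text{and}\quad \langle p\rangle = \hbar\langle k\rangle = \int|\overline\psi|^2\,\hbar k\,dk.$$

As noted already, if we define the operators
$\hat x = x$ ("multiply with $x$") and $\hat p = -i\hbar\,\partial/\partial x$,
then we can write

$$\langle x\rangle = \int\psi^*\,\hat x\,\psi\,dx \quad\text{and}\quad \langle p\rangle = \int\psi^*\,\hat p\,\psi\,dx.$$

By the same token,

$$\langle E\rangle = \int\psi^*\,\hat E\,\psi\,dx \quad\text{with}\quad \hat E = i\hbar\frac{\partial}{\partial t}.$$

Which observable is associated with the differential operator $\partial/\partial\phi$? If $r$ and $\theta$ are constant
(as the partial derivative with respect to $\phi$ requires), then $z$ is constant, and

$$\frac{\partial\psi}{\partial\phi} = \frac{\partial y}{\partial\phi}\frac{\partial\psi}{\partial y} + \frac{\partial x}{\partial\phi}\frac{\partial\psi}{\partial x}.$$

Given that $x = r\sin\theta\cos\phi$ and $y = r\sin\theta\sin\phi$, this works out at $x\,\partial\psi/\partial y - y\,\partial\psi/\partial x$, or

$$-i\hbar\frac{\partial}{\partial\phi} = \hat x\,\hat p_y - \hat y\,\hat p_x.$$

Since, classically, orbital angular momentum is given by $\mathbf L = \mathbf r\times\mathbf p$, so that $L_z = x\,p_y - y\,p_x$,
it seems obvious that we should consider $\hat x\,\hat p_y - \hat y\,\hat p_x$ as the operator $\hat l_z$ associated with the
$z$ component of the atom's angular momentum.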

Yet we need to be wary of basing quantum-mechanical definitions on classical ones. Here 
are the quantum-mechanical definitions: 

Consider the wave function $\psi(q_k,t)$ of a closed system $S$ with $K$ degrees of freedom. Suppose
that the probability distribution $|\psi(q_k,t)|^2$ (which is short for $|\psi(q_1,\dots,q_K,t)|^2$) is invariant
under translations in time: waiting for any amount of time $\tau$ makes no difference to it:

$$|\psi(q_k,t)|^2 = |\psi(q_k,t+\tau)|^2.$$

Then the time dependence of $\psi$ is confined to a phase factor $e^{i\alpha(q_k,t)}$.

Further suppose that the time coordinate $t$ and the space coordinates $q_k$ are homogeneous,
meaning that equal intervals are physically equivalent. Since $S$ is closed, the phase factor
$e^{i\alpha(q_k,t)}$ cannot then depend on $q_k$, and its phase can at most linearly depend on $t$: waiting
for $2\tau$ should have the same effect as twice waiting for $\tau$. In other words, multiplying the
wave function by $e^{i\alpha(2\tau)}$ should have the same effect as multiplying it twice by $e^{i\alpha(\tau)}$:

$$e^{i\alpha(2\tau)} = \big[e^{i\alpha(\tau)}\big]^2 = e^{i2\alpha(\tau)}.$$

Thus

$$\psi(q_k,t) = \psi(q_k)\,e^{-i\omega t} = \psi(q_k)\,e^{-(i/\hbar)Et}.$$

So the existence of a constant ("conserved") quantity $\omega$ or (in conventional units) $E$ is
implied for a closed system, and this is what we mean by the energy of the system.

Now suppose that $|\psi(q_k,t)|^2$ is invariant under translations in the direction of one of the
spatial coordinates $q_k$, say $q_j$:

$$|\psi(q_j,q_{k\neq j},t)|^2 = |\psi(q_j+\kappa,q_{k\neq j},t)|^2.$$

Then the dependence of $\psi$ on $q_j$ is confined to a phase factor $e^{i\beta(q_k,t)}$.

And suppose again that the coordinates $t$ and $q_k$ are homogeneous. Since $S$ is closed,
the phase factor $e^{i\beta(q_k,t)}$ cannot then depend on $q_{k\neq j}$ or $t$, and its phase can at most linearly
depend on $q_j$: translating $S$ by $2\kappa$ should have the same effect as twice translating it by $\kappa$. In
other words, multiplying the wave function by $e^{i\beta(2\kappa)}$ should have the same effect as multiplying
it twice by $e^{i\beta(\kappa)}$:

$$e^{i\beta(2\kappa)} = \big[e^{i\beta(\kappa)}\big]^2 = e^{i2\beta(\kappa)}.$$

Thus

$$\psi(q_k,t) = \psi(q_{k\neq j},t)\,e^{ik_jq_j} = \psi(q_{k\neq j},t)\,e^{(i/\hbar)\,p_jq_j}.$$

So the existence of a constant ("conserved") quantity $k_j$ or (in conventional units) $p_j$ is
implied for a closed system, and this is what we mean by the $j$-component of the system's
momentum.

You get the picture. Moreover, the spatial coordinates might as well be the spherical
coordinates $r, \theta, \phi$. If $|\psi(r,\theta,\phi,t)|^2$ is invariant under rotations about the $z$ axis, and if the
longitudinal coordinate $\phi$ is homogeneous, then

$$\psi(r,\theta,\phi,t) = \psi(r,\theta,t)\,e^{im\phi} = \psi(r,\theta,t)\,e^{(i/\hbar)\,l_z\phi}.$$

In this case we call the conserved quantity the $z$ component of the system's angular momentum.

Now suppose that $O$ is an observable, that $\hat O$ is the corresponding operator, and that $\psi_{\hat O,v}$
satisfies

$$\hat O\,\psi_{\hat O,v} = v\,\psi_{\hat O,v}.$$

We say that $\psi_{\hat O,v}$ is an eigenfunction or eigenstate of the operator $\hat O$, and that it has the
eigenvalue $v$. Let's calculate the mean and the standard deviation of $O$ for $\psi_{\hat O,v}$. We obviously
have that

$$\langle O\rangle = \int\psi^*_{\hat O,v}\,\hat O\,\psi_{\hat O,v}\,dx = \int\psi^*_{\hat O,v}\,v\,\psi_{\hat O,v}\,dx = v\int|\psi_{\hat O,v}|^2\,dx = v.$$

Hence

$$\Delta O = \sqrt{\int\psi^*_{\hat O,v}\,(\hat O-v)(\hat O-v)\,\psi_{\hat O,v}\,dx} = 0,$$

since $(\hat O-v)\,\psi_{\hat O,v} = 0$. For a system associated with $\psi_{\hat O,v}$, $O$ is dispersion-free. Hence the
probability of finding that the value of $O$ lies in an interval containing $v$ is 1. But we have
that

$$\hat E\,\psi(q_k)\,e^{-(i/\hbar)Et} = E\,\psi(q_k)\,e^{-(i/\hbar)Et},$$

$$\hat p_j\,\psi(q_{k\neq j},t)\,e^{(i/\hbar)\,p_jq_j} = p_j\,\psi(q_{k\neq j},t)\,e^{(i/\hbar)\,p_jq_j},$$

$$\hat l_z\,\psi(r,\theta,t)\,e^{(i/\hbar)\,l_z\phi} = l_z\,\psi(r,\theta,t)\,e^{(i/\hbar)\,l_z\phi}.$$

So, indeed, $\hat l_z$ is the operator associated with the $z$ component of the atom's angular
momentum.

Observe that the eigenfunctions of any of these operators are associated with systems for
which the corresponding observable is "sharp": the standard deviation measuring its fuzziness
vanishes.

For obvious reasons we also have

$$\hat l_x = -i\hbar\left(y\frac{\partial}{\partial z}-z\frac{\partial}{\partial y}\right) \quad\text{and}\quad \hat l_y = -i\hbar\left(z\frac{\partial}{\partial x}-x\frac{\partial}{\partial z}\right).$$

If we define the commutator $[\hat A,\hat B] = \hat A\hat B - \hat B\hat A$, then saying that the operators $\hat A$ and $\hat B$
commute is the same as saying that their commutator vanishes. Later we will prove that two
observables are compatible (can be simultaneously measured) if and only if their operators
commute.

Exercise: Show that $[\hat l_x,\hat l_y] = i\hbar\,\hat l_z$.

One similarly finds that $[\hat l_y,\hat l_z] = i\hbar\,\hat l_x$ and $[\hat l_z,\hat l_x] = i\hbar\,\hat l_y$. The upshot: different components of
a system's angular momentum are incompatible.

Exercise: Using the above commutators, show that the operator $\hat L^2 = \hat l_x^2+\hat l_y^2+\hat l_z^2$ commutes
with $\hat l_x$, $\hat l_y$, and $\hat l_z$.
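
The exercises ask for a proof with the differential operators. As a quick plausibility check (not a substitute for the proof), the commutation relations can be tested on the standard $3\times3$ matrices that represent the angular momentum components for $l = 1$. The matrix representation and the choice $\hbar = 1$ are assumptions brought in here; they do not appear in the text.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def commutator(a, b):
    """[A, B] = AB - BA."""
    return sub(matmul(a, b), matmul(b, a))


def is_close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# l = 1 matrices in the basis m = +1, 0, -1, with hbar = 1 (assumed).
S = 1.0 / math.sqrt(2.0)
LX = [[0, S, 0], [S, 0, S], [0, S, 0]]
LY = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
LZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]
L2 = add(add(matmul(LX, LX), matmul(LY, LY)), matmul(LZ, LZ))
ZERO = [[0, 0, 0] for _ in range(3)]

print(is_close(commutator(LX, LY), scale(1j, LZ)))
print(all(is_close(commutator(L2, comp), ZERO) for comp in (LX, LY, LZ)))
```

Both lines should print `True`; for $l = 1$ the matrix `L2` is simply $l(l+1) = 2$ times the unit matrix, which is why it commutes with everything.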


4.7 Beyond hydrogen: the Periodic Table 


If we again assume that the nucleus is fixed at the center and ignore relativistic and spin
effects, then the stationary states of helium are the solutions of the following equation:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\left[\frac{\partial^2\psi}{\partial x_1^2}+\frac{\partial^2\psi}{\partial y_1^2}+\frac{\partial^2\psi}{\partial z_1^2}+\frac{\partial^2\psi}{\partial x_2^2}+\frac{\partial^2\psi}{\partial y_2^2}+\frac{\partial^2\psi}{\partial z_2^2}\right] + \left[-\frac{2e^2}{r_1}-\frac{2e^2}{r_2}+\frac{e^2}{r_{12}}\right]\psi.$$

The wave function now depends on six coordinates, and the potential energy $V$ is made up
of three terms. $r_1 = \sqrt{x_1^2+y_1^2+z_1^2}$ and $r_2 = \sqrt{x_2^2+y_2^2+z_2^2}$ are associated with the respective
distances of the electrons from the nucleus, and $r_{12} = \sqrt{(x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2}$ is
associated with the distance between the electrons. Think of $e^2/r_{12}$ as the value the potential
energy associated with the two electrons would have if they were at $\mathbf r_1$ and $\mathbf r_2$, respectively.

Why are there no separate wave functions for the two electrons? The joint probability of
finding the first electron in a region $A$ and the second in a region $B$ (relative to the nucleus)
is given by

$$p(A,B) = \int_A d^3r_1\int_B d^3r_2\;|\psi(\mathbf r_1,\mathbf r_2)|^2.$$

If the probability of finding the first electron in $A$ were independent of the whereabouts of
the second electron, then we could assign to it a wave function $\psi_1(\mathbf r_1)$, and if the probability
of finding the second electron in $B$ were independent of the whereabouts of the first electron,
we could assign to it a wave function $\psi_2(\mathbf r_2)$. In this case $\psi(\mathbf r_1,\mathbf r_2)$ would be given by the
product $\psi_1(\mathbf r_1)\,\psi_2(\mathbf r_2)$ of the two wave functions, and $p(A,B)$ would be the product of
$p(A) = \int_A d^3r_1\,|\psi_1(\mathbf r_1)|^2$ and $p(B) = \int_B d^3r_2\,|\psi_2(\mathbf r_2)|^2$. But in general, and especially inside a
helium atom, the positional probability distribution for the first electron is conditional on
the whereabouts of the second electron, and vice versa, given that the two electrons repel
each other (to use the language of classical physics).

For the lowest energy levels, the above equation has been solved by numerical methods. 
With three or more electrons it is hopeless to look for exact solutions of the corresponding 
Schrodinger equation. Nevertheless, the Periodic Table 10 and many properties of the chemical 
elements can be understood by using the following approximate theory. 

First, we disregard the details of the interactions between the electrons. Next, since the
chemical properties of atoms depend on their outermost electrons, we consider each of these
electrons subject to a potential that is due to (i) the nucleus and (ii) a continuous, spherically
symmetric, charge distribution doing duty for the other electrons. We again neglect spin
effects except that we take account of the Pauli exclusion principle 11 , according to which


10 http://en.wikipedia.org/wiki/Periodic%20table

11 http://en.wikipedia.org/wiki/Pauli%20exclusion%20principle
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the probability of finding two electrons (more generally, two fermions 12 ) having exactly the 
same properties is 0. Thus two electrons can be associated with exactly the same wave 
function provided that their spin states differ in the following way: whenever the spins of 
the two electrons are measured with respect to a given axis, the outcomes are perfectly 
anticorrelated; one will be "up" and the other will be "down". Since there are only two 
possible outcomes, a third electron cannot be associated with the same wave function. 

This approximate theory yields stationary wave functions $\psi_{nlm}(\mathbf r)$ called orbitals 13 for
individual electrons. These are quite similar to the stationary wave functions one obtains
for the single electron of hydrogen, except that their dependence on the radial coordinate
is modified by the negative charge distribution representing the remaining electrons. As a
consequence of this modification, the energies associated with orbitals with the same quantum
number $n$ but different quantum numbers $l$ are no longer equal. For any given $n > 1$, orbitals
with higher $l$ yield a larger mean distance between the electron and the nucleus, and the
larger this distance, the more the negative charge of the remaining electrons screens the
positive charge of the nucleus. As a result, an electron with higher $l$ is less strongly bound
(given the same $n$), so its ionization energy 14 is lower.

Chemists group orbitals into shells 15 according to their principal quantum number. As we
have seen, the n-th shell can "accommodate" up to $n^2 \times 2$ electrons. Helium has the first
shell completely "filled" and the second shell "empty." Because the helium nucleus has twice
the charge of the hydrogen nucleus, the two electrons are, on average, much nearer the
nucleus than the single electron of hydrogen. The ionization energy of helium is therefore
much larger, 2372.3 kJ/mol 16 as compared to 1312.0 kJ/mol for hydrogen. On the other hand,
if you tried to add an electron to create a negative helium ion, it would have to go into the
second shell, which is almost completely screened from the nucleus by the electrons in the
first shell. Helium is therefore neither prone to give up an electron nor able to hold an extra
electron. It is chemically inert, as are all elements in the rightmost column of the Periodic
Table.
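The count $n^2 \times 2$ can be checked by simply listing the allowed quantum numbers. The following Python sketch assumes the hydrogen-like labeling used above (l = 0, ..., n-1; m = -l, ..., +l; two spin outcomes per orbital). The function name shell_capacity is made up for this illustration.

```python
def shell_capacity(n):
    """Count the (l, m, spin) combinations available in the n-th shell."""
    states = [
        (l, m, spin)
        for l in range(n)              # l = 0, 1, ..., n-1
        for m in range(-l, l + 1)      # m = -l, ..., +l
        for spin in ("up", "down")     # at most two electrons per orbital
    ]
    return len(states)


for n in (1, 2, 3, 4):
    # The direct count and the closed form 2 * n**2 are printed side by side.
    print(n, shell_capacity(n), 2 * n**2)
```

For n = 1 and n = 2 this gives the 2 and 8 "slots" that are filled in helium and neon, respectively.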

In the second row of the Periodic Table the second shell gets filled. Since the energies of the 
2p orbitals are higher than that of the 2s orbital, the latter gets "filled" first. With each added 
electron (and proton!) the entire electron distribution gets pulled in, and the ionization 
energy goes up, from 520.2 kJ/mol for lithium (atomic number Z=3) to 2080.8 kJ/mol for neon
(Z=10). While lithium readily parts with an electron, fluorine (Z=9) with a single empty
"slot" in the second shell is prone to grab one. Both are therefore quite active chemically.
The progression from sodium (Z=11) to argon (Z=18) parallels that from lithium to neon.

There is a noteworthy peculiarity in the corresponding sequences of ionization energies 17 : 
The ionization energy of oxygen (Z=8, 1313.9 kJ/mol) is lower than that of nitrogen (Z=7,
1402.3 kJ/mol), and that of sulfur (Z=16, 999.6 kJ/mol) is lower than that of phosphorus
(Z=15, 1011.8 kJ/mol). To understand why this is so, we must take account of certain details
of the inter-electronic forces that we have so far ignored. 


12 http://en.wikipedia.org/wiki/Fermion

13 http://en.wikipedia.org/wiki/Atomic%20orbital

14 http://en.wikipedia.org/wiki/Ionization%20potential

15 http://en.wikipedia.org/wiki/Electron%20shell

16 http://en.wikipedia.org/wiki/Joule%20per%20mole

17 http://en.wikipedia.org/wiki/Ionization%20energies%20of%20the%20elements
Suppose that one of the two 2p electrons of carbon (Z=6) goes into the m=0 orbital with
respect to the z axis. Where will the other 2p electron go? It will go into any vacant orbital
that minimizes the repulsion between the two electrons, by maximizing their mean distance.
This is neither of the orbitals with |m|=1 with respect to the z axis but an orbital with
m=0 with respect to some axis perpendicular to the z axis. If we call this the x axis, then
the third 2p electron of nitrogen goes into the orbital with m=0 relative to the y axis. The
fourth 2p electron of oxygen then has no choice but to go, with opposite spin, into an
already occupied 2p orbital. This raises its energy significantly and accounts for the drop in
ionization energy from nitrogen to oxygen.

By the time the 3p orbitals are "filled," the energies of the 3d states are pushed up so high 
(as a result of screening) that the 4s state is energetically lower. The "filling up" of the 
3d orbitals therefore begins only after the 4s orbitals are "occupied," with scandium (Z=21). 

Thus even this simplified and approximate version of the quantum theory of atoms has the
power to predict the qualitative and many of the quantitative features of the Periodic Table.

4.8 Probability flux 

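The time rate of change of the probability density $\rho(t,\mathbf{r}) = |\psi(t,\mathbf{r})|^2$ (at a fixed location $\mathbf{r}$)
is given by

\[ \frac{\partial\rho}{\partial t} = \psi^*\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^*}{\partial t}. \]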


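With the help of the Schrödinger equation and its complex conjugate,

\[ i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\left(\frac{\hbar}{i}\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)\cdot\left(\frac{\hbar}{i}\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)\psi + V\psi, \]

\[ \frac{\hbar}{i}\frac{\partial\psi^*}{\partial t} = \frac{1}{2m}\left(i\hbar\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)\cdot\left(i\hbar\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)\psi^* + V\psi^*, \]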

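one obtains

\[ \frac{\partial\rho}{\partial t} = -\frac{i}{\hbar}\psi^*\left[\frac{1}{2m}\left(i\hbar\frac{\partial}{\partial\mathbf{r}} + \mathbf{A}\right)\cdot\left(i\hbar\frac{\partial}{\partial\mathbf{r}} + \mathbf{A}\right)\psi + V\psi\right] + \frac{i}{\hbar}\psi\left[\frac{1}{2m}\left(i\hbar\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)\cdot\left(i\hbar\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)\psi^* + V\psi^*\right]. \]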


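The terms containing V cancel out, so we are left with

\[ \frac{\partial\rho}{\partial t} = -\frac{i}{2m\hbar}\left[\psi^*\left(i\hbar\frac{\partial}{\partial\mathbf{r}} + \mathbf{A}\right)\cdot\left(i\hbar\frac{\partial}{\partial\mathbf{r}} + \mathbf{A}\right)\psi - \psi\left(i\hbar\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)\cdot\left(i\hbar\frac{\partial}{\partial\mathbf{r}} - \mathbf{A}\right)\psi^*\right] \]

\[ = -\frac{\hbar}{2mi}\left(\psi^*\frac{\partial^2\psi}{\partial\mathbf{r}^2} - \psi\frac{\partial^2\psi^*}{\partial\mathbf{r}^2}\right) + \frac{1}{m}\left(\psi\psi^*\frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{A} + \mathbf{A}\cdot\frac{\partial\psi}{\partial\mathbf{r}}\psi^* + \mathbf{A}\psi\cdot\frac{\partial\psi^*}{\partial\mathbf{r}}\right). \]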


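Next, we calculate the divergence of $\mathbf{j} = \frac{\hbar}{2mi}\left(\psi^*\frac{\partial\psi}{\partial\mathbf{r}} - \frac{\partial\psi^*}{\partial\mathbf{r}}\psi\right) - \frac{1}{m}\mathbf{A}\psi^*\psi$:

\[ \frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{j} = \frac{\hbar}{2mi}\left(\psi^*\frac{\partial^2\psi}{\partial\mathbf{r}^2} - \frac{\partial^2\psi^*}{\partial\mathbf{r}^2}\psi\right) - \frac{1}{m}\left(\psi\psi^*\frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{A} + \mathbf{A}\cdot\frac{\partial\psi}{\partial\mathbf{r}}\psi^* + \mathbf{A}\psi\cdot\frac{\partial\psi^*}{\partial\mathbf{r}}\right). \]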


The upshot:

\[ \frac{\partial\rho}{\partial t} = -\frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{j}. \]


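Integrated over a spatial region R with unchanging boundary $\partial R$:

\[ \frac{\partial}{\partial t}\int_R \rho\,d^3r = -\int_R \frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{j}\,d^3r. \]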

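According to Gauss's theorem (the divergence theorem) 19, the outward flux of $\mathbf{j}$ through $\partial R$ equals the integral of the
divergence 20 of $\mathbf{j}$ over R:

\[ \oint_{\partial R}\mathbf{j}\cdot d\boldsymbol{\Sigma} = \int_R \frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{j}\,d^3r. \]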


We thus have that

\[ \boxed{\frac{\partial}{\partial t}\int_R \rho\,d^3r = -\oint_{\partial R}\mathbf{j}\cdot d\boldsymbol{\Sigma}.} \]

If $\rho$ is the continuous density of some kind of stuff (stuff per unit volume) and $\mathbf{j}$ is its flux
(stuff per unit area per unit time), then on the left-hand side we have the rate at which the 
stuff inside R increases, and on the right-hand side we have the rate at which stuff enters 
through the surface of R. So if some stuff moves from place A to place B, it crosses the 
boundary of any region that contains either A or B. This is why the framed equation is 
known as a continuity equation 21 . 

In the quantum world, however, there is no such thing as continuously distributed and/or
continuously moving stuff. $\rho$ and $\mathbf{j}$, respectively, are a density (something per unit volume)
and a flux (something per unit area per unit time) only in a formal sense. If $\psi$ is the
wave function associated with a particle, then the integral $\int_R \rho\,d^3r = \int_R |\psi|^2\,d^3r$ gives the
probability of finding the particle in R if the appropriate measurement is made, and the
framed equation tells us this: if the probability of finding the particle inside R, as a function
of the time at which the measurement is made, increases, then the probability of finding the
particle outside R, as a function of the same time, decreases by the same amount. (Much


19 http://en.wikipedia.org/wiki/Gauss%27s%20law

20 http://en.wikipedia.org/wiki/Divergence

21 http://en.wikipedia.org/wiki/Continuity%20equation
the same holds if $\psi$ is associated with a system having n degrees of freedom and R is a
region of the system's configuration space.) This is sometimes expressed by saying that 
"probability is (locally) conserved." When you hear this, then remember that the probability 
for something to happen in a given place at a given time isn't anything that is situated at 
that place or that exists at that time. 
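The local form of the continuity equation can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the text: one spatial dimension, no vector potential (A = 0), units with $\hbar = m = 1$, and a particle confined to a box of unit length in an equal superposition of the two lowest stationary states. Under these assumptions $j = (\hbar/m)\,\mathrm{Im}(\psi^*\,\partial\psi/\partial x)$, and all function names are hypothetical.

```python
import cmath
import math

HBAR = M = 1.0  # assumed units: hbar = m = 1, box length 1


def psi(x, t):
    """Equal superposition of the two lowest box states (A = 0 assumed)."""
    total = 0j
    for n in (1, 2):
        energy = (n * math.pi) ** 2 / 2          # E_n = n^2 pi^2 / 2 in these units
        total += math.sqrt(2) * math.sin(n * math.pi * x) * cmath.exp(-1j * energy * t)
    return total / math.sqrt(2)


def rho(x, t):
    """Probability density |psi|^2."""
    return abs(psi(x, t)) ** 2


def flux(x, t, h=1e-5):
    """Probability flux j = (hbar/m) Im(psi* dpsi/dx), derivative by central difference."""
    dpsi = (psi(x + h, t) - psi(x - h, t)) / (2 * h)
    return (HBAR / M) * (psi(x, t).conjugate() * dpsi).imag


def continuity_residual(x, t, h=1e-4):
    """d(rho)/dt + dj/dx, which should vanish up to discretization error."""
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    dj_dx = (flux(x + h, t) - flux(x - h, t)) / (2 * h)
    return drho_dt + dj_dx


print(continuity_residual(0.3, 0.05))
```

The printed residual is tiny compared with the individual terms, each of which is of order 10 at this point. This reflects the bookkeeping expressed by the continuity equation, not the motion of any stuff.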




5 Entanglement (a preview) 


5.1 Bell's theorem: the simplest version 


Quantum mechanics permits us to create the following scenario. 

• Pairs of particles are launched in opposite directions. 

• Each particle is subjected to one of three possible measurements (1, 2, or 3).

• Each time the two measurements are chosen at random. 

• Each measurement has two possible results, indicated by a red or green light.

Here is what we find:


• If both particles are subjected to the same measurement, identical results are never 
obtained. 

• The two sequences of recorded outcomes are completely random. In particular, half of 
the time both lights are the same color. 




If this doesn't bother you, then please explain how it is that the colors differ whenever 
identical measurements are performed! 

The obvious explanation would be that each particle arrives with an "instruction set" - 
some property that pre-determines the outcome of every possible measurement. Let's see 
what this entails. 

Each particle arrives with one of the following $2^3 = 8$ instruction sets:

RRR, RRG, RGR, GRR, RGG, GRG, GGR, or GGG.

(If a particle arrives with, say, RGG, then the apparatus flashes red if it is set to 1 and
green if it is set to 2 or 3.) In order to explain why the outcomes differ whenever both
particles are subjected to the same measurement, we have to assume that particles launched 
together arrive with opposite instruction sets. If one carries the instruction (or arrives with 
the property denoted by) RRG, then the other carries the instruction GGR. 
Suppose that the instruction sets are RRG and GGR. In this case we observe different
colors with the following five of the $3^2 = 9$ possible combinations of apparatus settings:

1-1, 2-2, 3-3, 1-2, and 2-1,

and we observe equal colors with the following four:

1-3, 2-3, 3-1, and 3-2.

Because the settings are chosen at random, this particular pair of instruction sets thus 
results in different colors 5/9 of the time. The same is true for the other pairs of instruction 
sets except the pair RRR, GGG. If the two particles carry these respective instruction sets, 
we see different colors every time. It follows that we see different colors at least 5/9 of the 
time. 
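The counting argument can be carried out for all eight instruction sets at once. The following Python sketch enumerates every instruction set together with its opposite partner and all nine equally likely pairs of settings. The helper names are made up for this illustration.

```python
from fractions import Fraction
from itertools import product


def opposite(instructions):
    """Instruction set carried by the partner particle (R and G swapped)."""
    return "".join("G" if c == "R" else "R" for c in instructions)


def p_different(instructions):
    """Probability of different colors when both settings are chosen at random."""
    partner = opposite(instructions)
    settings = list(product(range(3), repeat=2))   # 9 equally likely setting pairs
    differ = sum(instructions[a] != partner[b] for a, b in settings)
    return Fraction(differ, len(settings))


table = {"".join(s): p_different("".join(s)) for s in product("RG", repeat=3)}
for instructions, p in table.items():
    print(instructions, p)
print("minimum:", min(table.values()))
```

Every instruction set yields either 5/9 or 1, so no mixture of instruction sets can bring the probability of different colors below 5/9.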

But different colors are observed half of the time! In reality the probability of observing 
different colors is 1/2. Conclusion: the statistical predictions of quantum mechanics cannot 
be explained with the help of instruction sets. In other words, these measurements do 
not reveal pre-existent properties. They create the properties the possession of which they 
indicate. 

Then how is it that the colors differ whenever identical measurements are made? How does 
one apparatus "know" which measurement is performed and which outcome is obtained by 
the other apparatus?

Whenever the joint probability p(A,B) of the respective outcomes A and B of two measurements
does not equal the product p(A) p(B) of the individual probabilities, the outcomes
(or their probabilities) are said to be correlated. With equal apparatus settings we
have p(R,R) = p(G,G) = 0, and this obviously differs from the products p(R) p(R)
and p(G) p(G), which equal 1/2 × 1/2. What kind of mechanism is responsible for the
correlations between the measurement outcomes?

You understand this as much as anybody else! 

The conclusion that we see different colors at least 5/9 of the time is Bell's theorem (or
Bell's inequality) for this particular setup. The fact that the universe violates the logic of
Bell's theorem is evidence that particles do not carry instruction sets embedded within
them and instead have instantaneous knowledge of other particles at a great distance. Here
is a comment by a distinguished Princeton physicist as quoted by David Mermin: 1

Anybody who's not bothered by Bell's theorem has to have rocks in his head. 

And here is why Einstein wasn't happy with quantum mechanics: 

I cannot seriously believe in it because it cannot be reconciled with the idea that physics 
should represent a reality in time and space, free from spooky actions at a distance. 2 

Sadly, Einstein (1879 - 1955) did not know Bell's theorem of 1964. We know now that 


1 N. David Mermin, "Is the Moon there when nobody looks? Reality and the quantum theory," Physics 
Today , April 1985. The version of Bell's theorem discussed in this section first appeared in this article. 

2 Albert Einstein, The Born-Einstein Letters, with comments by Max Born (New York: Walker, 1971).
there must be a mechanism whereby the setting of one measurement device can influence
the reading of another instrument, however remote. 3

Spooky actions at a distance are here to stay!


5.2 A quantum game 

Here are the rules: 5

• Two teams play against each other: Andy, Bob, and Charles (the "players") versus the 
"interrogators". 

• Each player is asked either "What is the value of X?" or "What is the value of Y?" 

• Only two answers are allowed: +1 or -1.

• Either each player is asked the X question, or one player is asked the X question and the
two other players are asked the Y question.

• The players win if the product of their answers is -1 in case only X questions are asked,
and if the product of their answers is +1 in case Y questions are asked. Otherwise they
lose.

• The players are not allowed to communicate with each other once the questions are asked. 
Before that, they are permitted to work out a strategy. 

Is there a failsafe strategy? Can they make sure that they will win? Stop to ponder the 
question. 

Let us try pre-agreed answers, which we will call $X_A, X_B, X_C$ and $Y_A, Y_B, Y_C$. The
winning combinations satisfy the following equations:

\[ X_A Y_B Y_C = 1, \quad Y_A X_B Y_C = 1, \quad Y_A Y_B X_C = 1, \quad X_A X_B X_C = -1. \]

Consider the first three equations. The product of their right-hand sides equals +1. The
product of their left-hand sides equals $X_A X_B X_C$, because each Y appears twice and $Y^2 = 1$;
this implies that $X_A X_B X_C = 1$. (Remember that the possible values are ±1.) But if
$X_A X_B X_C = 1$, then the fourth equation $X_A X_B X_C = -1$ obviously cannot be satisfied.

The bottom line: There is no failsafe strategy with pre-agreed answers. 
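The same conclusion can be reached by brute force. The following Python sketch tries all $2^6 = 64$ sets of pre-agreed answers and counts how many of the four possible rounds each set wins. The function name wins is made up for this illustration.

```python
from itertools import product


def wins(xa, xb, xc, ya, yb, yc):
    """Number of the four possible question rounds won with these fixed answers."""
    return sum([
        xa * yb * yc == 1,    # Andy gets X, Bob and Charles get Y
        ya * xb * yc == 1,    # Bob gets X
        ya * yb * xc == 1,    # Charles gets X
        xa * xb * xc == -1,   # everybody gets X
    ])


best = max(wins(*answers) for answers in product((+1, -1), repeat=6))
print("rounds won by the best pre-agreed strategy:", best, "out of 4")
```

The best any set of pre-agreed answers can do is win three of the four rounds.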



3 John S. Bell, "On the Einstein Podolsky Rosen paradox," Physics 1, pp. 195-200, 1964. 


5 Lev Vaidman, "Variations on the theme of the Greenberger-Horne-Zeilinger proof," Foundations of 
Physics 29, pp. 615-30, 1999. 

5.3 The experiment of Greenberger, Horne, and Zeilinger

And yet there is a failsafe strategy. 7 

Here goes: 

• Andy, Bob, and Charles prepare three particles (for instance, electrons) in a particular 
way. As a result, they are able to predict the probabilities of the possible outcomes of 
any spin measurement to which the three particles may subsequently be subjected. In 
principle these probabilities do not depend on how far the particles are apart. 

• Each player takes one particle with him. 

• Whoever is asked the X question measures the x component of the spin of his particle and 
answers with his outcome, and whoever is asked the Y question measures the y component 
of the spin of his particle and answers likewise. (All you need to know at this point about 
the spin of a particle is that its component with respect to any one axis can be measured, 
and that for the type of particle used by the players there are two possible outcomes, 
namely +1 and -1.)

Proceeding in this way, the team of players is sure to win every time. 
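The text does not say how the three particles are prepared. One standard choice that does the job, assumed here purely for illustration, is the state $(|000\rangle - |111\rangle)/\sqrt{2}$, where 0 and 1 label the two outcomes of a spin measurement with respect to the z axis. The Python sketch below applies the Pauli operators $\sigma_x$ and $\sigma_y$ to this state and evaluates the expectation values of the four relevant products. The helper names are hypothetical.

```python
import math


def apply_pauli(state, qubit, axis):
    """Apply sigma_x or sigma_y to one particle of a state given as {bits: amplitude}."""
    out = {}
    for bits, amp in state.items():
        flipped = list(bits)
        flipped[qubit] ^= 1
        # sigma_x: |0> -> |1>, |1> -> |0>;  sigma_y: |0> -> i|1>, |1> -> -i|0>
        phase = 1 if axis == "x" else (1j if bits[qubit] == 0 else -1j)
        key = tuple(flipped)
        out[key] = out.get(key, 0j) + phase * amp
    return out


def product_value(state, axes):
    """Expectation value of the product of the three chosen spin components."""
    new = state
    for qubit, axis in enumerate(axes):
        new = apply_pauli(new, qubit, axis)
    return sum(state.get(bits, 0j).conjugate() * amp for bits, amp in new.items()).real


# Assumed preparation: (|000> - |111>) / sqrt(2)
ghz = {(0, 0, 0): complex(1 / math.sqrt(2)), (1, 1, 1): complex(-1 / math.sqrt(2))}

for axes in ("xyy", "yxy", "yyx", "xxx"):
    print(axes.upper(), round(product_value(ghz, axes), 12))
```

Since each product can only take the values +1 and -1, an expectation value of +1 (or -1) means that the product of the three outcomes is certain. With this preparation the three mixed products come out as +1 and the XXX product as -1, which is exactly what the players need.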

Is it possible for the x and y components of the spins of the three particles to be in possession
of values before their values are actually measured?

Suppose that the y components of the three spins have been measured. The three equations

\[ X_A Y_B Y_C = 1, \quad Y_A X_B Y_C = 1, \quad Y_A Y_B X_C = 1 \]

of the previous section 8 tell us what we would have found if the x component of any one 
of the three particles had been measured instead of the y component. If we assume that 
the x components are in possession of values even though they are not measured, then their 
values can be inferred from the measured values of the three y components. 

Try to fill in the following table in such a way that

• each cell contains either +1 or -1,

• the product of the three X values equals -1, and

• the product of every pair of Y values equals the remaining X value.

Can it be done?

      |  A  |  B  |  C
    --+-----+-----+-----
    X |     |     |
    Y |     |     |

The answer is negative, for the same reason that the four equations 


7 D. M. Greenberger, M. A. Horne, and A. Zeilinger, "Going beyond Bell's theorem," in Bell's theorem, 
Quantum Theory, and Conception of the Universe, edited by M. Kafatos (Dordrecht: Kluwer Academic, 
1989), pp. 69-72. 

8 Chapter 5.1 on page 89 
\[ X_A Y_B Y_C = 1, \quad Y_A X_B Y_C = 1, \quad Y_A Y_B X_C = 1, \quad X_A X_B X_C = -1 \]

cannot all be satisfied. Just as there can be no strategy with pre-agreed answers, there 
can be no pre-existent values. We seem to have no choice but to conclude that these spin 
components are in possession of values only if (and only when) they are actually measured. 

Any two outcomes suffice to predict a third outcome. If two x components are measured, the
third x component can be predicted; if two y components are measured, the x component of
the third spin can be predicted; and if one x and one y component are measured, the
y component of the third spin can be predicted. How can we understand this given that

• the values of the spin components are created as and when they are measured,

• the relative times of the measurements are irrelevant,

• in principle the three particles can be millions of miles apart?

How does the third spin "know" which components of the other spins are measured and 
which outcomes are obtained? What mechanism correlates the outcomes? 

You understand this as much as anybody else! 

7 Appendix


7.1 Probability 

7.1.1 Basic Concepts 

Probability is a numerical measure of likelihood. If an event has a probability equal to 1 (or 
100%), then it is certain to occur. If it has a probability equal to 0, then it will definitely not 
occur. And if it has a probability equal to 1/2 (or 50%), then it is as likely as not to occur. 

You will know that tossing a fair coin has probability 1/2 to yield heads, and that casting a
fair die has probability 1/6 to yield a 1. How do we know this? 

There is a principle known as the principle of indifference, which states: if there are n 
mutually exclusive and jointly exhaustive possibilities, and if, as far as we know, there are 
no differences between the n possibilities apart from their names (such as "heads" or "tails"), 
then each possibility should be assigned a probability equal to 1/n. (Mutually exclusive: only
one possibility can be realized in a single trial. Jointly exhaustive: at least one possibility is
realized in a single trial. Mutually exclusive and jointly exhaustive: exactly one possibility is
realized in a single trial.)

Since this principle appeals to what we know, it concerns epistemic probabilities (a.k.a. 
subjective probabilities) or degrees of belief. If you are certain of the truth of a proposition, 
then you assign to it a probability equal to 1. If you are certain that a proposition is false, 
then you assign to it a probability equal to 0. And if you have no information that makes 
you believe that the truth of a proposition is more likely (or less likely) than its falsity, 
then you assign to it probability 1/2. Subjective probabilities are therefore also known as 
ignorance probabilities: if you are ignorant of any differences between the possibilities, you 
assign to them equal probabilities. 

If we assign probability 1 to a proposition because we believe that it is true, we assign a 
subjective probability, and if we assign probability 1 to an event because it is certain that it 
will occur, we assign an objective probability. Until the advent of quantum mechanics, the 
only objective probabilities known were relative frequencies. 

The advantage of the frequentist definition of probability is that it allows us to measure 
probabilities, at least approximately. The trouble with it is that it refers to ensembles. You 
can't measure the probability of heads by tossing a single coin. You get better and better
approximations to the probability of heads by tossing a larger and larger number N of coins
and dividing the number $N_H$ of heads by N. The exact probability of heads is the limit

\[ p(H) = \lim_{N\to\infty}\frac{N_H}{N}. \]
The meaning of this formula is that for any positive number $\epsilon$, however small, you can find
a (sufficiently large but finite) number N such that

\[ \left| p(H) - \frac{N_H}{N} \right| < \epsilon. \]


The probability that m events from a mutually exclusive and jointly exhaustive set of 
n possible events happen is the sum of the probabilities of the m events. Suppose, for 
example, you win if you cast either a 1 or a 6. The probability of winning is 


\[ p(1 \text{ or } 6) = p(1) + p(6) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}. \]

In frequentist terms, this is virtually self-evident. N(1)/N approximates p(1), N(6)/N
approximates p(6), and [N(1) + N(6)]/N approximates p(1 or 6).

The probability that two independent events happen is the product of the probabilities of 
the individual events. Suppose, for example, you cast two dice and you win if the total is 12. 
Then 


\[ p(6 \text{ and } 6) = p(6) \times p(6) = \frac{1}{6} \times \frac{1}{6} = \frac{1}{36}. \]

By the principle of indifference, there are now 6 x 6 = 36 equiprobable possibilities, and 
casting a total of 12 with two dice is one of them. 
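Both rules can be verified by counting equiprobable possibilities, as the principle of indifference suggests. The following Python sketch does the counting with exact fractions. The helper name probability is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)


def probability(event, outcomes):
    """Fraction of equiprobable outcomes for which the event holds."""
    outcomes = list(outcomes)
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))


# Sum rule: casting a 1 or a 6 with one die.
p_1_or_6 = probability(lambda face: face in (1, 6), faces)

# Product rule: a total of 12 with two independent dice (36 equiprobable pairs).
p_total_12 = probability(lambda pair: sum(pair) == 12, product(faces, repeat=2))

print(p_1_or_6, p_total_12)
```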

It is important to remember that the joint probability p(A,B) = p(A and B) of two events 
A,B equals the product of the individual probabilities p(A) and p(B) only if the two events 
are independent, meaning that the probability of one does not depend on whether or not 
the other happens. In terms of propositions: the probability that the conjunction $P_1$ and $P_2$
is true is the probability that $P_1$ is true times the probability that $P_2$ is true only if the
probability that either proposition is true does not depend on whether the other is true or 
false. Ignoring this can have the most tragic consequences 1 . 

The general rule for the joint probability of two events is 


\[ p(A,B) = p(B|A)\,p(A) = p(A|B)\,p(B). \]

p(B|A) is a conditional probability: the probability of B given A.

To see this, let N(A,B) be the number of trials in which both A and B happen or are
true. N(A,B)/N approximates p(A,B), N(A,B)/N(A) approximates p(B|A), and N(A)/N
approximates p(A). But

\[ p(A,B) = \lim_{N\to\infty}\frac{N(A,B)}{N} = \lim_{N\to\infty}\frac{N(A,B)}{N(A)}\,\frac{N(A)}{N} = p(B|A)\,p(A). \]


1 http://en.wikipedia.org/wiki/Sally%20Clark
An immediate consequence of this is Bayes' theorem:

\[ p(B|A) = \frac{p(A|B)}{p(A)}\,p(B). \]


The following is just as readily established: 


\[ p(X) = p(X|Y)\,p(Y) + p(X|\overline{Y})\,p(\overline{Y}), \]

where $\overline{Y}$ happens or is true whenever Y does not happen or is false. The generalization to
n > 2 mutually exclusive and jointly exhaustive possibilities should be obvious. 

Given a random variable, which is a set $X = \{x_1, \ldots, x_n\}$ of random numbers, we may want
to know the arithmetic mean

\[ \langle X\rangle = \frac{1}{n}\sum_{k=1}^{n} x_k = \frac{x_1 + \cdots + x_n}{n}, \]


as well as the standard deviation, which is the root-mean-square deviation from the arithmetic
mean,

\[ \sigma(X) = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(x_k - \langle X\rangle\right)^2}. \]


The standard deviation is an important measure of statistical dispersion. 

Given n possible measurement outcomes $v_1, \ldots, v_n$ with probabilities $p_k = p(v_k)$, we have
a probability distribution $\{p_1, \ldots, p_n\}$, and we may want to know the expected value of X,
defined by

\[ \langle X\rangle = \sum_{k=1}^{n} p_k v_k, \]

as well as the corresponding standard deviation

\[ \sigma(X) = \sqrt{\sum_{k=1}^{n} p_k\left(v_k - \langle X\rangle\right)^2}, \]


which is a handy measure of the fuzziness of X. 
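As a worked example, here is the expected value and the standard deviation for a fair die, computed from the two definitions above. The Python sketch is only an illustration, and the function names are made up.

```python
import math


def expected_value(values, probabilities):
    """Sum of p_k * v_k over all possible outcomes."""
    return sum(p * v for v, p in zip(values, probabilities))


def standard_deviation(values, probabilities):
    """Square root of the probability-weighted mean square deviation."""
    mean = expected_value(values, probabilities)
    return math.sqrt(sum(p * (v - mean) ** 2 for v, p in zip(values, probabilities)))


faces = [1, 2, 3, 4, 5, 6]
fair = [1 / 6] * 6   # principle of indifference: equal probabilities

print(expected_value(faces, fair), standard_deviation(faces, fair))
```

The expected value is 3.5, a number that is not itself a possible outcome, and the standard deviation is $\sqrt{35/12}$, roughly 1.71.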

We have defined probability as a numerical measure of likelihood. So what is likelihood? 
What is probability apart from being a numerical measure? The frequentist definition covers 
some cases, the epistemic definition covers others, but which definition would cover all cases? 
It seems that probability is one of those concepts that are intuitively meaningful to us but,
just like time or the experience of purple, cannot be explained in terms of other concepts.
7.1.2 Some Problems

Problem 1 (Monty Hall). A player in a game show is given the choice of three doors. 
Behind one door is the Grand Prize (say, a car); behind the other two doors are booby 
prizes (say, goats). The player picks a door, and the show host peeks behind the doors and 
opens one of the remaining doors. There is a booby prize behind the door he opened. The 
host then offers the player either to stay with the door that was chosen at the beginning, or 
to switch to the other closed door. What gives the player the better chance of winning: to 
switch doors or to stay with the original choice? Or are the chances equal? 

Problem 2. Imagine you toss a coin successively and wait till the first time the pattern 
HTT appears. For example, if the sequence of tosses was 

HHTHHTHHTTHHTTTHTH 

then the pattern HTT would appear after the 10th toss. Let A(HTT) be the average number 
of tosses until HTT occurs, and let A(HTH) be the average number of tosses until HTH 
occurs. Which of the following is true? 

(a) A(HTH) < A(HTT), (b) A(HTH) = A(HTT), or (c) A(HTH) > A(HTT).

Problem 3. Imagine a test for a certain disease (say, HIV) that is 99% accurate. And 
suppose a person picked at random tests positive. What is the probability that the person 
actually has the disease? 


Solutions 

Problem 1. Let p(C1) be the probability that the car is behind door 1, p(O3) the probability
that the host opens door 3, and p(O3|C1) the probability that the host opens door 3 given
that the car is behind door 1. We have

\[ p(O3) = p(O3|C1)\,p(C1) + p(O3|C2)\,p(C2) + p(O3|C3)\,p(C3) \]

as well as

\[ p(O3|C2)\,p(C2) = p(C2|O3)\,p(O3). \]

If the first choice is door 1, then p(O3|C1) = 1/2, p(O3|C2) = 1, and p(O3|C3) = 0. Hence

\[ p(O3) = \frac{1}{2}\times\frac{1}{3} + 1\times\frac{1}{3} + 0\times\frac{1}{3} = \frac{1}{2} \]


and thus

\[ p(C2|O3) = \frac{p(O3|C2)\,p(C2)}{p(O3)} = \frac{1\times\frac{1}{3}}{\frac{1}{2}} = \frac{2}{3}. \]


In words: If the player's first choice is door 1 and the host opens door 3, then the probability 
that the car is behind door 2 is 2/3, whereas the probability that it is behind door 1 is
1 - 2/3 = 1/3. A quicker way to see that switching doubles the chances of winning is to compare
this game with another one, in which the show host offers the choice of either opening the 
originally chosen door or opening both other doors (and winning regardless of which, if any, 
has the car). 

Note: This result depends on the show host deliberately opening only a door with a goat
behind it. If she doesn't know (or doesn't care!) which door the car is behind, and opens
a remaining door at random, then 1/3 of the outcomes that were initially possible have
been removed by her having opened a door with a goat. In this case the player gains no
advantage (or disadvantage) by switching. So the answer depends on the rules of the game,
not just the sequence of events. Of course the player may not know what the "rules" are in
this respect, in which case he should still switch doors because there can be no disadvantage
in doing so.
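Both versions of the game can be run through Bayes' theorem with exact fractions. In the Python sketch below the player has picked door 1 and the host has opened door 3, revealing a goat. The function name and the flag host_knows are made up for this illustration; for the random host, the likelihoods are the probabilities that she opens door 3 and a goat appears.

```python
from fractions import Fraction


def p_car_behind_2_given_host_opens_3(host_knows):
    """Player picked door 1; the host opened door 3 and a goat was revealed."""
    prior = Fraction(1, 3)
    if host_knows:
        # Host never reveals the car: p(O3|C1) = 1/2, p(O3|C2) = 1, p(O3|C3) = 0.
        likelihood = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}
    else:
        # Host opens door 2 or 3 at random; a goat shows behind door 3
        # unless the car is there.
        likelihood = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(0)}
    p_o3 = sum(likelihood[c] * prior for c in (1, 2, 3))
    return likelihood[2] * prior / p_o3


print(p_car_behind_2_given_host_opens_3(host_knows=True))
print(p_car_behind_2_given_host_opens_3(host_knows=False))
```

With a host who knows where the car is, switching wins with probability 2/3; with a host who opens a door at random, the two remaining doors are equally likely.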

Problem 2. The average number of tosses until HTT occurs, A(HTT), equals 8, whereas 
A(HTH) = 10. To see why the latter is greater, imagine you have tossed HT. If you are 
looking for HTH and the next toss gives you HTT, then your next chance to see HTH is 
after a total of 6 tosses, whereas if you are looking for HTT and the next toss gives you 
HTH, then your next chance to see HTT is after a total of 5 tosses. 
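The two averages can also be computed rather than argued. The Python sketch below is one possible way of doing this, not the method used in the text: it keeps track of how many letters of the pattern are currently matched and repeatedly applies the relation "expected remaining tosses = 1 + average over the two possible next tosses" until the numbers settle. The function names and the number of sweeps are arbitrary choices.

```python
def next_state(pattern, matched, toss):
    """Number of pattern letters matched after one more toss."""
    s = pattern[:matched] + toss
    while not pattern.startswith(s):
        s = s[1:]          # drop letters from the front until a prefix of the pattern remains
    return len(s)


def expected_wait(pattern, sweeps=2000):
    """Average number of tosses of a fair coin until the pattern first appears."""
    n = len(pattern)
    e = [0.0] * (n + 1)    # e[k]: expected remaining tosses with k letters matched
    for _ in range(sweeps):
        e = [
            1 + 0.5 * e[next_state(pattern, k, "H")] + 0.5 * e[next_state(pattern, k, "T")]
            for k in range(n)
        ] + [0.0]          # nothing left to wait for once the pattern is complete
    return e[0]


print(expected_wait("HTT"), expected_wait("HTH"))
```

The difference between the two patterns shows up in next_state: after HT, a wrong toss for HTT (an H) leaves one letter matched, whereas a wrong toss for HTH (a T) leaves none.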

Problem 3. The answer depends on how rare the disease is. Suppose that one in 10,000 has
it. This means 100 in a million. If a million are tested, the 100 who have the disease will
yield 99 true positives and one false negative. Of the remaining 999,900, 99% (that is,
989,901) will yield true negatives and 1% (that is, 9,999) will yield false positives. The
probability that a randomly picked person testing positive actually has the disease is the
number of true positives divided by the number of positives, which in this particular example
is 99/(9999+99) = 0.0098, less than 1%!
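The same calculation can be repeated for other values of the prevalence. The Python sketch below assumes, as the solution above does, that "99% accurate" applies equally to people who have the disease and to people who don't. The function name is made up for this illustration.

```python
def p_disease_given_positive(prevalence, accuracy):
    """Bayes' theorem for a test that is equally accurate for sick and healthy people."""
    true_positives = prevalence * accuracy
    false_positives = (1 - prevalence) * (1 - accuracy)
    return true_positives / (true_positives + false_positives)


for prevalence in (1 / 10_000, 1 / 1_000, 1 / 100, 1 / 10):
    print(prevalence, round(p_disease_given_positive(prevalence, 0.99), 4))
```

For a prevalence of one in 10,000 this reproduces the figure of just under 1%. For a prevalence of one in 100 the same test yields a probability of exactly 1/2.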


7.1.3 Moral 

Be it scientific data or evidence in court, there are usually competing explanations, and
usually each explanation has a likely bit and an unlikely bit. For example, having the disease 
is unlikely, but the test is likely to be correct; not having the disease is likely, but a false 
test result is unlikely. You can see the importance of accurate assessments of the likelihood 
of competing explanations, and if you have tried the problems, you have seen that we aren't 
very good at such assessments. 

7.2 Mathematical tools

7.2.1 Elements of calculus

A definite integral

Imagine an object O that is free to move in one dimension — say, along the x axis. Like 
every physical object, it has a more or less fuzzy position (relative to whatever reference 
object we choose). For the purpose of describing its fuzzy position, quantum mechanics 
provides us with a probability density $\rho(x)$. This depends on actual measurement outcomes,
and it allows us to calculate the probability of finding the particle in any given interval of 
the x axis, provided that an appropriate measurement is made. (Remember our mantra: 
the mathematical formalism of quantum mechanics serves to assign probabilities to possible 
measurement outcomes on the basis of actual outcomes.) 



We call $\rho(x)$ a probability density because it represents a probability per unit length. The
probability of finding O in the interval between $x_1$ and $x_2$ is given by the area A between
the graph of $\rho(x)$, the x axis, and the vertical lines at $x_1$ and $x_2$, respectively. How do we
calculate this area? The trick is to cover it with narrow rectangles of width $\Delta x$.

The area of the first rectangle from the left is $\rho(x_1 + \Delta x)\,\Delta x$, the area of the second is
$\rho(x_1 + 2\Delta x)\,\Delta x$, and the area of the last is $\rho(x_1 + 12\Delta x)\,\Delta x$. For the sum of these areas we
have the shorthand notation

\[ \sum_{k=1}^{12} \rho(x_1 + k\,\Delta x)\,\Delta x. \]


It is not hard to visualize that if we increase the number N of rectangles and at the same
time decrease the width $\Delta x$ of each rectangle, then the sum of the areas of all rectangles
fitting under the graph of $\rho(x)$ between $x_1$ and $x_2$ gives us a better and better approximation
to the area A and thus to the probability of finding O in the interval between $x_1$ and $x_2$.
As $\Delta x$ tends toward 0 and N tends toward infinity ($\infty$), the above sum tends toward the
integral

\[ \int_{x_1}^{x_2} \rho(x)\,dx. \]


We sometimes call this a definite integral to emphasize that it's just a number. (As you can 
guess, there are also indefinite integrals, about which more later.) The uppercase delta has 
turned into a d indicating that dx is an infinitely small (or infinitesimal ) width, and the 
summation symbol (the uppercase sigma) has turned into an elongated S indicating that we 
are adding infinitely many infinitesimal areas. 
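To watch the sum of rectangles approach the integral, here is a small Python sketch. The density used is a hypothetical example (a Gaussian normalized to total area 1), chosen only because the exact area between 0 and 1 is available through the error function; the function names are made up.

```python
import math


def density(x):
    """Hypothetical probability density: a Gaussian with total area 1."""
    return math.exp(-x * x) / math.sqrt(math.pi)


def rectangle_sum(x1, x2, n):
    """Sum of n rectangles of width dx, each as high as the density at its right edge."""
    dx = (x2 - x1) / n
    return sum(density(x1 + k * dx) * dx for k in range(1, n + 1))


exact = 0.5 * (math.erf(1.0) - math.erf(0.0))   # area under the density between 0 and 1
for n in (12, 120, 1200):
    print(n, rectangle_sum(0.0, 1.0, n), exact)
```

With 12 rectangles the sum is off in the second decimal place; with 1200 it agrees with the exact area to about three decimal places.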
Don't let the term "infinitesimal" scare you. An infinitesimal quantity means nothing by
itself. It is the combination of the integration symbol $\int$ with the infinitesimal quantity dx
that makes sense as a limit, in which N grows above any number however large, dx (and
hence the area of each rectangle) shrinks below any (positive) number however small, while
the sum of the areas tends toward a well-defined, finite number.


Differential calculus: a very brief introduction 

Another method by which we can obtain a well-defined, finite number from infinitesimal 
quantities is to divide one such quantity by another. 

We shall assume throughout that we are dealing with well-behaved functions, which means 
that you can plot the graph of such a function without lifting up your pencil, and you can 
do the same with each of the function's derivatives. So what is a function, and what is the 
derivative of a function? 


A function f(x) is a machine with an input and an output. Insert a number x and out pops
the number f(x). Rather confusingly, we sometimes think of f(x) not as a machine that
churns out numbers but as the number churned out when x is inserted.



The (first) derivative f'(x) of f(x) is a function that tells us how much f(x) increases as x
increases (starting from a given value of x, say $x_0$) in the limit in which both the increase
$\Delta x$ in x and the corresponding increase $\Delta f = f(x + \Delta x) - f(x)$ in f(x) (which of course
may be negative) tend toward 0:

\[ f'(x_0) = \lim_{\Delta x\to 0}\frac{\Delta f}{\Delta x} = \frac{df}{dx}(x_0). \]


The above diagrams illustrate this limit. The ratio $\Delta f/\Delta x$ is the slope of the straight line through the black circles (that is, the tangent of the angle between the positive $x$ axis and the straight line, measured counterclockwise from the positive $x$ axis). As $\Delta x$ decreases, the black circle at $x+\Delta x$ slides along the graph of $f(x)$ towards the black circle at $x$, and the slope of the straight line through the circles increases. In the limit $\Delta x \to 0$, the straight line becomes a tangent on the graph of $f(x)$, touching it at $x$. The slope of the tangent on $f(x)$ at $x_0$ is what we mean by the slope of $f(x)$ at $x_0$.
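The limit described above can be watched numerically. The following is a minimal sketch, not part of the original text: the helper name `difference_quotient` and the test function $f(x) = x^3$ are illustrative assumptions. For this function the quotient at $x_0 = 2$ works out to $12 + 6\,\Delta x + (\Delta x)^2$, so it should settle toward 12 as $\Delta x$ shrinks.

```python
def difference_quotient(f, x0, dx):
    """Slope of the straight line through (x0, f(x0)) and (x0 + dx, f(x0 + dx))."""
    return (f(x0 + dx) - f(x0)) / dx


def cube(x):
    """Illustrative test function f(x) = x**3 (an assumption, not from the text)."""
    return x ** 3


# For f(x) = x**3 the quotient at x0 = 2 equals 12 + 6*dx + dx**2,
# so the slopes should approach 12 as dx shrinks.
steps = [10.0 ** -k for k in range(1, 7)]
slopes = [difference_quotient(cube, 2.0, dx) for dx in steps]

for dx, slope in zip(steps, slopes):
    print(f"dx = {dx:.0e}   slope = {slope:.8f}")
```

Shrinking `dx` much further would eventually let floating-point rounding dominate, which is one reason the limit is a mathematical idea and not a recipe for arbitrarily small numerical steps.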


So the first derivative $f'(x)$ of $f(x)$ is the function that equals the slope of $f(x)$ for every $x$. To differentiate a function $f$ is to obtain its first derivative $f'$. By differentiating $f'$, we obtain the second derivative $f'' = \frac{d^2f}{dx^2}$ of $f$, by differentiating $f''$ we obtain the third derivative $f''' = \frac{d^3f}{dx^3}$, and so on.


It is readily shown that if $a$ is a number and $f$ and $g$ are functions of $x$, then

$$\frac{d(af)}{dx} = a\,\frac{df}{dx} \quad\text{and}\quad \frac{d(f+g)}{dx} = \frac{df}{dx} + \frac{dg}{dx}.$$


A slightly more difficult problem is to differentiate the product $e = fg$ of two functions of $x$. Think of $f$ and $g$ as the vertical and horizontal sides of a rectangle of area $e$. As $x$ increases by $\Delta x$, the product $fg$ increases by the sum of the areas of the three white rectangles in this diagram:

Figure 72 (rectangle with sides labeled $f$ and $g$ and increments labeled $df$ and $dg$)


In other words,

$$\Delta e = f(\Delta g) + (\Delta f)g + (\Delta f)(\Delta g)$$

and thus

$$\frac{\Delta e}{\Delta x} = f\,\frac{\Delta g}{\Delta x} + \frac{\Delta f}{\Delta x}\,g + \frac{\Delta f\,\Delta g}{\Delta x}.$$

If we now take the limit in which $\Delta x$ and, hence, $\Delta f$ and $\Delta g$ tend toward 0, the first two terms on the right-hand side tend toward $fg' + f'g$. What about the third term? Because it is the product of an expression (either $\Delta f$ or $\Delta g$) that tends toward 0 and an expression (either $\Delta g/\Delta x$ or $\Delta f/\Delta x$) that tends toward a finite number, it tends toward 0. The bottom line:

$$e' = (fg)' = fg' + f'g.$$

This is readily generalized to products of $n$ functions. Here is a special case:

$$(f^n)' = f^{n-1}f' + f^{n-2}f'f + f^{n-3}f'f^2 + \cdots + f'f^{n-1} = n\,f^{n-1}f'.$$

Observe that there are $n$ equal terms between the two equal signs. If the function $f$ returns whatever you insert, this boils down to

$$(x^n)' = n\,x^{n-1}.$$

Now suppose that $g$ is a function of $f$ and $f$ is a function of $x$. An increase in $x$ by $\Delta x$ causes an increase in $f$ by $\Delta f \approx \frac{df}{dx}\Delta x$, and this in turn causes an increase in $g$ by $\Delta g \approx \frac{dg}{df}\Delta f$. Thus $\frac{\Delta g}{\Delta x} \approx \frac{dg}{df}\frac{df}{dx}$. In the limit $\Delta x \to 0$ the $\approx$ becomes a $=$:

$$\frac{dg}{dx} = \frac{dg}{df}\,\frac{df}{dx}.$$

We obtained $(x^n)' = n\,x^{n-1}$ for integers $n \geq 2$. Obviously it also holds for $n = 0$ and $n = 1$.

1. Show that it also holds for negative integers $n$. Hint: Use the product rule to calculate $(x^n x^{-n})'$.

2. Show that $(\sqrt{x})' = 1/(2\sqrt{x})$. Hint: Use the product rule to calculate $(\sqrt{x}\sqrt{x})'$.

3. Show that $(x^n)' = n\,x^{n-1}$ also holds for $n = 1/m$ where $m$ is a natural number.

4. Show that this equation also holds if $n$ is a rational number. Use the chain rule $\frac{dg}{dx} = \frac{dg}{df}\frac{df}{dx}$.

Since every real number is the limit of a sequence of rational numbers, we may now confidently proceed on the assumption that $(x^n)' = n\,x^{n-1}$ holds for all real numbers $n$.


Taylor series 

A well-behaved function can be expanded into a power series. This means that for all non-negative integers $k$ there are real numbers $a_k$ such that

$$f(x) = \sum_{k=0}^{\infty} a_k x^k = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \cdots$$

Let us calculate the first four derivatives using $(x^n)' = n\,x^{n-1}$:

$$f'(x) = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + 5a_5 x^4 + \cdots$$

$$f''(x) = 2a_2 + 2\cdot 3\,a_3 x + 3\cdot 4\,a_4 x^2 + 4\cdot 5\,a_5 x^3 + \cdots$$

$$f'''(x) = 2\cdot 3\,a_3 + 2\cdot 3\cdot 4\,a_4 x + 3\cdot 4\cdot 5\,a_5 x^2 + \cdots$$

$$f''''(x) = 2\cdot 3\cdot 4\,a_4 + 2\cdot 3\cdot 4\cdot 5\,a_5 x + \cdots$$

Setting $x$ equal to zero, we obtain

$$f(0) = a_0,\quad f'(0) = a_1,\quad f''(0) = 2a_2,\quad f'''(0) = 2\times 3\,a_3,\quad f''''(0) = 2\times 3\times 4\,a_4.$$

Let us write $f^{(n)}(x)$ for the $n$-th derivative of $f(x)$. We also write $f^{(0)}(x) = f(x)$: think of $f(x)$ as the "zeroth derivative" of $f(x)$. We thus arrive at the general result $f^{(k)}(0) = k!\,a_k$, where the factorial $k!$ is defined as equal to 1 for $k = 0$ and $k = 1$ and as the product of all natural numbers $n \leq k$ for $k > 1$. Expressing the coefficients in terms of the derivatives of $f(x)$ at $x = 0$, we obtain

$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!}\,x^k = f(0) + f'(0)\,x + f''(0)\,\frac{x^2}{2!} + f'''(0)\,\frac{x^3}{3!} + \cdots$$

This is the Taylor series for $f(x)$.

A remarkable result: if you know the value of a well-behaved function $f(x)$ and the values of all of its derivatives at the single point $x = 0$ then you know $f(x)$ at all points $x$. Besides, there is nothing special about $x = 0$, so $f(x)$ is also determined by its value and the values of its derivatives at any other point $x_0$:

$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k.$$

The exponential function 

We define the function $\exp(x)$ by requiring that

$$\exp'(x) = \exp(x) \quad\text{and}\quad \exp(0) = 1.$$

The value of this function is everywhere equal to its slope. Differentiating the first defining equation repeatedly we find that

$$\exp^{(n)}(x) = \exp^{(n-1)}(x) = \cdots = \exp(x).$$

The second defining equation now tells us that $\exp^{(k)}(0) = 1$ for all $k$. The result is a particularly simple Taylor series:

$$\exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \cdots$$
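To see how quickly this series converges, one can add up its first few terms. The sketch below is an illustration only: the function name `exp_series` and the choice of 30 terms are assumptions made here, and Python's built-in `math.exp` serves merely as a reference value.

```python
import math


def exp_series(x, terms=30):
    """Partial sum of the series sum_k x**k / k!, using the first `terms` terms."""
    total = 0.0
    term = 1.0  # x**0 / 0!
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # turns x**k / k! into x**(k+1) / (k+1)!
    return total


for x in (0.5, 1.0, 2.0):
    print(x, exp_series(x), math.exp(x))
```

Building each term from the previous one avoids computing large powers and factorials separately, which keeps the partial sums well behaved for moderate values of `x`.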

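Let us check that a well-behaved function satisfies the equation

$$f(a)f(b) = f(a+b)$$

if and only if

$$f^{(i+k)}(0) = f^{(i)}(0)\,f^{(k)}(0).$$

We will do this by expanding the $f$'s in powers of $a$ and $b$ and comparing coefficients. We have

$$f(a)f(b) = \sum_{i=0}^{\infty}\sum_{k=0}^{\infty} \frac{f^{(i)}(0)\,f^{(k)}(0)}{i!\,k!}\,a^i b^k,$$

and using the binomial expansion

$$(a+b)^i = \sum_{l=0}^{i} \frac{i!}{(i-l)!\,l!}\,a^{i-l}b^l,$$

we also have that

$$f(a+b) = \sum_{i=0}^{\infty} \frac{f^{(i)}(0)}{i!}\,(a+b)^i = \sum_{i=0}^{\infty}\sum_{l=0}^{i} \frac{f^{(i)}(0)}{(i-l)!\,l!}\,a^{i-l}b^l = \sum_{i=0}^{\infty}\sum_{k=0}^{\infty} \frac{f^{(i+k)}(0)}{i!\,k!}\,a^i b^k.$$

Voilà.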

The function $\exp(x)$ obviously satisfies $f^{(i+k)}(0) = f^{(i)}(0)\,f^{(k)}(0)$ and hence $f(a)f(b) = f(a+b)$.

So does the function $f(x) = \exp(ux)$.

Moreover, $f^{(i+k)}(0) = f^{(i)}(0)\,f^{(k)}(0)$ implies $f^{(n)}(0) = [f'(0)]^n$.

We gather from this

• that the functions satisfying $f(a)f(b) = f(a+b)$ form a one-parameter family, the parameter being the real number $f'(0)$, and

• that the one-parameter family of functions $\exp(ux)$ satisfies $f(a)f(b) = f(a+b)$, the parameter being the real number $u$.

But $f(x) = v^x$ also defines a one-parameter family of functions that satisfies $f(a)f(b) = f(a+b)$, the parameter being the positive number $v$.

Conclusion: for every real number $u$ there is a positive number $v$ (and vice versa) such that $v^x = \exp(ux)$.

One of the most important numbers is $e$, defined as the number $v$ for which $u = 1$, that is, $e^x = \exp(x)$:

$$e = \exp(1) = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \cdots = 2.7182818284590452353602874713526\ldots$$

The natural logarithm $\ln(x)$ is defined as the inverse of $\exp(x)$, so $\exp[\ln(x)] = \ln[\exp(x)] = x$. Show that

$$\frac{d\ln f(x)}{dx} = \frac{1}{f(x)}\,\frac{df}{dx}.$$

Hint: differentiate $\exp\{\ln[f(x)]\}$.

The indefinite integral

How do we add up infinitely many infinitesimal areas? This is elementary if we know a function $F(x)$ of which $f(x)$ is the first derivative. If $f(x) = \frac{dF}{dx}$ then $dF(x) = f(x)\,dx$ and

$$\int_a^b f(x)\,dx = \int_a^b dF(x) = F(b) - F(a).$$

All we have to do is to add up the infinitesimal amounts $dF$ by which $F(x)$ increases as $x$ increases from $a$ to $b$, and this is simply the difference between $F(b)$ and $F(a)$.

A function $F(x)$ of which $f(x)$ is the first derivative is called an integral or antiderivative of $f(x)$. Because the integral of $f(x)$ is determined only up to a constant, it is also known as the indefinite integral of $f(x)$. Note that wherever $f(x)$ is negative, the area between its graph and the $x$ axis counts as negative.

How do we calculate the integral $I = \int_a^b dx\,f(x)$ if we don't know any antiderivative of the integrand $f(x)$? Generally we look up a table of integrals. Doing it ourselves calls for a significant amount of skill. As an illustration, let us do the Gaussian integral

$$I = \int_{-\infty}^{+\infty} dx\,e^{-x^2/2}.$$

For this integral someone has discovered the following trick. (The trouble is that different integrals generally require different tricks.) Start with the square of $I$:

$$I^2 = \int_{-\infty}^{+\infty} dx\,e^{-x^2/2}\int_{-\infty}^{+\infty} dy\,e^{-y^2/2} = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} dx\,dy\,e^{-(x^2+y^2)/2}.$$

This is an integral over the $x$-$y$ plane. Instead of dividing this plane into infinitesimal rectangles $dx\,dy$, we may divide it into concentric rings of radius $r$ and infinitesimal width $dr$. Since the area of such a ring is $2\pi r\,dr$, we have that

$$I^2 = 2\pi\int_0^{+\infty} dr\,r\,e^{-r^2/2}.$$

Now there is only one integration to be done. Next we make use of the fact that $\frac{d\,r^2}{dr} = 2r$, hence $dr\,r = d(r^2/2)$, and we introduce the variable $w = r^2/2$:

$$I^2 = 2\pi\int_0^{+\infty} d(r^2/2)\,e^{-r^2/2} = 2\pi\int_0^{+\infty} dw\,e^{-w}.$$

Since we know that the antiderivative of $e^{-w}$ is $-e^{-w}$, we also know that

$$\int_0^{+\infty} dw\,e^{-w} = (-e^{-\infty}) - (-e^{-0}) = 0 + 1 = 1.$$

Therefore $I^2 = 2\pi$ and

$$\int_{-\infty}^{+\infty} dx\,e^{-x^2/2} = \sqrt{2\pi}.$$

Believe it or not, a significant fraction of the literature in theoretical physics concerns 
variations and elaborations of this basic Gaussian integral. 
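The value $\sqrt{2\pi}$ of the basic Gaussian integral can be checked by literally adding up many thin rectangles. This is a rough numerical sketch, not a substitute for the trick above: the function name `gaussian_integral`, the cutoff at $\pm 10$ and the number of steps are all assumptions chosen for illustration (beyond $|x| = 10$ the integrand is smaller than $e^{-50}$ and contributes nothing visible).

```python
import math


def gaussian_integral(half_width=10.0, steps=20_000):
    """Midpoint-rule estimate of the integral of exp(-x**2 / 2) over [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps):
        x = -half_width + (n + 0.5) * dx  # midpoint of the n-th rectangle
        total += math.exp(-x * x / 2.0) * dx
    return total


estimate = gaussian_integral()
print(estimate, math.sqrt(2.0 * math.pi))
```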

One variation is obtained by substituting $\sqrt{a}\,x$ for $x$:

$$\int_{-\infty}^{+\infty} dx\,e^{-ax^2/2} = \sqrt{2\pi/a}.$$

Another variation is obtained by thinking of both sides of this equation as functions of $a$ and differentiating them with respect to $a$. The result is

$$\int_{-\infty}^{+\infty} dx\,e^{-ax^2/2}\,x^2 = \sqrt{2\pi/a^3}.$$

Sine and cosine

We define the function $\cos(x)$ by requiring that

$$\cos''(x) = -\cos(x),\quad \cos(0) = 1 \quad\text{and}\quad \cos'(0) = 0.$$

If you sketch the graph of this function using only this information, you will notice that 
wherever cos(x) is positive, its slope decreases as x increases (that is, its graph curves 
downward), and wherever cos(x) is negative, its slope increases as x increases (that is, its 
graph curves upward). 

Differentiating the first defining equation repeatedly yields

$$\cos^{(n+2)}(x) = -\cos^{(n)}(x)$$

for all natural numbers $n$. Using the remaining defining equations, we find that $\cos^{(k)}(0)$ equals 1 for $k = 0, 4, 8, 12, \ldots$, $-1$ for $k = 2, 6, 10, 14, \ldots$, and 0 for odd $k$. This leads to the following Taylor series:

$$\cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$

The function $\sin(x)$ is similarly defined by requiring that

$$\sin''(x) = -\sin(x),\quad \sin(0) = 0, \quad\text{and}\quad \sin'(0) = 1.$$

This leads to the Taylor series

$$\sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$

7.2.2 Complex numbers 

The natural numbers 10 are used for counting. By subtracting natural numbers from natural numbers, we can create integers 11 that are not natural numbers. By dividing integers by integers (other than zero) we can create rational numbers 12 that are not integers. By taking the square roots of positive rational numbers we can create real numbers 13 that are irrational 14 . And by taking the square roots of negative numbers we can create complex numbers 15 that are imaginary 16 .

10 http://en.wikipedia.org/wiki/Natural%20number
11 http://en.wikipedia.org/wiki/Integers
12 http://en.wikipedia.org/wiki/Rational%20number
13 http://en.wikipedia.org/wiki/Real%20number

Any imaginary number is a real number multiplied by the positive square root of $-1$, for which we have the symbol $i = +\sqrt{-1}$.

Every complex number $z$ is the sum of a real number $a$ (the real part 17 of $z$) and an imaginary number $ib$. Somewhat confusingly, the imaginary part 18 of $z$ is the real number $b$.

Because real numbers can be visualized as points on a line, they are also referred to as (or 
thought of as constituting) the real line 19 . Because complex numbers can be visualized as 
points in a plane, they are also referred to as (or thought of as constituting) the complex 
plane 20 . This plane contains two axes, one horizontal (the real axis constituted by the real 
numbers) and one vertical (the imaginary axis constituted by the imaginary numbers). 

Do not be misled by the whimsical tags "real" and "imaginary". No number is real in
the sense in which, say, apples are real. The real numbers are no less imaginary in the 
ordinary sense than the imaginary numbers, and the imaginary numbers are no less real in 
the mathematical sense than the real numbers. If you are not yet familiar with complex 
numbers, it is because you don't need them for counting or measuring. You need them for 
calculating the probabilities of measurement outcomes. 



14 http://en.wikipedia.org/wiki/Irrational%20number
15 http://en.wikipedia.org/wiki/Complex%20number
16 http://en.wikipedia.org/wiki/Imaginary%20number
17 http://en.wikipedia.org/wiki/Real%20part
18 http://en.wikipedia.org/wiki/Imaginary%20part
19 http://en.wikipedia.org/wiki/Real%20line
20 http://en.wikipedia.org/wiki/Complex%20plane


This diagram illustrates, among other things, the addition of complex numbers: 


$$z_1 + z_2 = (a_1 + ib_1) + (a_2 + ib_2) = (a_1 + a_2) + i(b_1 + b_2).$$

As you can see, adding two complex numbers is done in the same way as adding two vectors 21 
(a, b) and (c,d) in a plane. 

Instead of using rectangular coordinates specifying the real and imaginary parts of a complex number, we may use polar coordinates specifying the absolute value or modulus $r = |z|$ and the complex argument or phase 22 $\alpha$, which is an angle measured in radians 23 . Here is how these coordinates are related:

$$a = r\cos\alpha,\qquad b = r\sin\alpha,\qquad r = +\sqrt{a^2 + b^2}$$

(Remember Pythagoras 24 ?)

$$\alpha = \begin{cases}
\arctan(b/a) & \text{if } a > 0 \\
\arctan(b/a) + \pi & \text{if } a < 0 \text{ and } b \geq 0 \\
\arctan(b/a) - \pi & \text{if } a < 0 \text{ and } b < 0 \\
+\pi/2 & \text{if } a = 0 \text{ and } b > 0 \\
-\pi/2 & \text{if } a = 0 \text{ and } b < 0
\end{cases}$$

or

$$\alpha = \begin{cases}
+\arccos(a/r) & \text{if } b \geq 0 \\
-\arccos(a/r) & \text{if } b < 0
\end{cases}$$

All you need to know to be able to multiply complex numbers is that $i^2 = -1$:

$$z_1 z_2 = (a_1 + ib_1)(a_2 + ib_2) = (a_1a_2 - b_1b_2) + i(a_1b_2 + b_1a_2).$$

There is, however, an easier way to multiply complex numbers. Plugging the power series 25 (or Taylor series 26 ) for cos and sin,

$$\cos x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\,x^{2k} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$

$$\sin x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}\,x^{2k+1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$

into the expression $\cos x + i\sin x$ and rearranging terms, we obtain

$$\sum_{k=0}^{\infty} \frac{(ix)^k}{k!} = 1 + ix + \frac{(ix)^2}{2!} + \frac{(ix)^3}{3!} + \frac{(ix)^4}{4!} + \frac{(ix)^5}{5!} + \frac{(ix)^6}{6!} + \frac{(ix)^7}{7!} + \cdots$$

21 http://en.wikipedia.org/wiki/Vector%20%28spatial%29%23Vector%20addition%20and%20subtraction
22 http://mathworld.wolfram.com/ComplexNumber.html
23 http://en.wikipedia.org/wiki/Radian
24 http://en.wikipedia.org/wiki/Pythagorean%20theorem
25 http://en.wikipedia.org/wiki/Power%20series
26 http://en.wikipedia.org/wiki/Taylor%20series

But this is the power/Taylor series for the exponential function 27 $e^y$ with $y = ix$! Hence Euler's formula 28

$$e^{i\alpha} = \cos\alpha + i\sin\alpha,$$

and this reduces multiplying two complex numbers to multiplying their absolute values and adding their phases:

$$(z_1)(z_2) = r_1e^{i\alpha_1}\,r_2e^{i\alpha_2} = (r_1r_2)\,e^{i(\alpha_1+\alpha_2)}.$$
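The two ways of multiplying can be compared directly. The following sketch is illustrative only: the helper name `multiply_polar` and the two sample numbers are assumptions, while `cmath.polar` and `cmath.rect` are the standard-library conversions between rectangular and polar form.

```python
import cmath


def multiply_polar(z1, z2):
    """Multiply two complex numbers by multiplying moduli and adding phases."""
    r1, alpha1 = cmath.polar(z1)
    r2, alpha2 = cmath.polar(z2)
    return cmath.rect(r1 * r2, alpha1 + alpha2)


z1, z2 = 1 + 2j, -3 + 0.5j
direct = z1 * z2                    # uses i**2 = -1 term by term
via_polar = multiply_polar(z1, z2)  # uses r1*r2 and alpha1 + alpha2

print(direct, via_polar)

# Euler's formula for a sample phase (0.7 is an arbitrary choice)
alpha = 0.7
euler_lhs = cmath.exp(1j * alpha)
euler_rhs = complex(cmath.cos(alpha).real, cmath.sin(alpha).real)
```

The two products agree up to rounding, since the polar route passes through trigonometric functions while the direct route uses only multiplications and additions.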

An extremely useful definition is the complex conjugate 29 $z^* = a - ib$ of $z = a + ib$. Among other things, it allows us to calculate the absolute square $|z|^2$ by calculating the product

$$zz^* = (a + ib)(a - ib) = a^2 + b^2.$$

1. Show that

$$\cos x = \frac{e^{ix} + e^{-ix}}{2} \quad\text{and}\quad \sin x = \frac{e^{ix} - e^{-ix}}{2i}.$$

2. Arguably the five most important numbers are $0, 1, i, \pi, e$. Write down an equation containing each of these numbers just once. (Answer?) 30



7.2.3 Vectors (spatial) 

A vector 32 is a quantity that has both a magnitude and a direction. Vectors can be visualized as arrows. The following figure shows what we mean by the components $(a_x, a_y, a_z)$ of a vector $\mathbf{a}$.

27 http://en.wikipedia.org/wiki/Exponential%20function
28 http://en.wikipedia.org/wiki/Euler%27s%20formula
29 http://en.wikipedia.org/wiki/Complex%20conjugate
30 http://en.wikipedia.org/wiki/Euler%27s%20identity
32 http://en.wikipedia.org/wiki/Vector%20%28spatial%29

Figure 74


The sum $\mathbf{a} + \mathbf{b}$ of two vectors has the components $(a_x + b_x, a_y + b_y, a_z + b_z)$.

• Explain the addition of vectors in terms of arrows.

The dot product 33 of two vectors is the number

$$\mathbf{a}\cdot\mathbf{b} = a_xb_x + a_yb_y + a_zb_z.$$

Its importance arises from the fact that it is invariant under rotations 34 . To see this, we calculate

$$(\mathbf{a}+\mathbf{b})\cdot(\mathbf{a}+\mathbf{b}) = (a_x+b_x)^2 + (a_y+b_y)^2 + (a_z+b_z)^2$$

$$= a_x^2 + a_y^2 + a_z^2 + b_x^2 + b_y^2 + b_z^2 + 2\,(a_xb_x + a_yb_y + a_zb_z) = \mathbf{a}\cdot\mathbf{a} + \mathbf{b}\cdot\mathbf{b} + 2\,\mathbf{a}\cdot\mathbf{b}.$$

According to Pythagoras 35 , the magnitude of $\mathbf{a}$ is $a = \sqrt{a_x^2 + a_y^2 + a_z^2}$. If we use a different coordinate system, the components of $\mathbf{a}$ will be different: $(a_x, a_y, a_z) \to (a'_x, a'_y, a'_z)$. But if the new system of axes differs only by a rotation and/or translation 36 of the axes, the magnitude of $\mathbf{a}$ will remain the same:

$$\sqrt{a_x^2 + a_y^2 + a_z^2} = \sqrt{(a'_x)^2 + (a'_y)^2 + (a'_z)^2}.$$

The squared magnitudes $\mathbf{a}\cdot\mathbf{a}$, $\mathbf{b}\cdot\mathbf{b}$, and $(\mathbf{a}+\mathbf{b})\cdot(\mathbf{a}+\mathbf{b})$ are invariant under rotations, and so, therefore, is the product $\mathbf{a}\cdot\mathbf{b}$.

33 http://en.wikipedia.org/wiki/Dot%20product
34 http://en.wikipedia.org/wiki/Rotation%20%28mathematics%29
35 http://en.wikipedia.org/wiki/Pythagorean%20theorem

• Show that the dot product is also invariant under translations. 

Since by a scalar we mean a number that is invariant under certain transformations (in this case rotations and/or translations of the coordinate axes), the dot product is also known as (a) scalar product. Let us prove that

$$\mathbf{a}\cdot\mathbf{b} = ab\cos\theta,$$

where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$. To do so, we pick a coordinate system $\mathcal{F}$ in which $\mathbf{a} = (a, 0, 0)$. In this coordinate system $\mathbf{a}\cdot\mathbf{b} = ab_x$ with $b_x = b\cos\theta$. Since $\mathbf{a}\cdot\mathbf{b}$ is a scalar, and since scalars are invariant under rotations and translations, the result $\mathbf{a}\cdot\mathbf{b} = ab\cos\theta$ (which makes no reference to any particular frame) holds in all frames that are rotated and/or translated relative to $\mathcal{F}$.
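The invariance of the dot product under a rotation of the axes can be tried out on numbers. This is a small sketch under assumptions made here: the helper names `dot` and `rotate_z`, the two sample vectors and the rotation angle are all illustrative, and only a rotation about the $z$ axis is tested.

```python
import math


def dot(a, b):
    """Dot product a_x*b_x + a_y*b_y + a_z*b_z of two 3-component tuples."""
    return sum(p * q for p, q in zip(a, b))


def rotate_z(v, angle):
    """Components of v with respect to axes rotated by `angle` about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x + s * y, -s * x + c * y, z)


a = (1.0, 2.0, 3.0)
b = (-2.0, 0.5, 4.0)
a_rot, b_rot = rotate_z(a, 0.7), rotate_z(b, 0.7)

# The components change, the dot product and the magnitudes do not.
print(a_rot, b_rot)
print(dot(a, b), dot(a_rot, b_rot))
```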

We now introduce the unit vectors $\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}$, whose directions are defined by the coordinate axes. They are said to form an orthonormal basis. Ortho because they are mutually orthogonal:

$$\hat{\mathbf{x}}\cdot\hat{\mathbf{y}} = \hat{\mathbf{x}}\cdot\hat{\mathbf{z}} = \hat{\mathbf{y}}\cdot\hat{\mathbf{z}} = 0.$$

Normal because they are unit vectors:

$$\hat{\mathbf{x}}\cdot\hat{\mathbf{x}} = \hat{\mathbf{y}}\cdot\hat{\mathbf{y}} = \hat{\mathbf{z}}\cdot\hat{\mathbf{z}} = 1.$$

And basis because every vector $\mathbf{v}$ can be written as a linear combination 37 of these three vectors, that is, a sum in which each basis vector appears once, multiplied by the corresponding component of $\mathbf{v}$ (which may be 0):

$$\mathbf{v} = v_x\hat{\mathbf{x}} + v_y\hat{\mathbf{y}} + v_z\hat{\mathbf{z}}.$$

It is readily seen that $v_x = \hat{\mathbf{x}}\cdot\mathbf{v}$, $v_y = \hat{\mathbf{y}}\cdot\mathbf{v}$, $v_z = \hat{\mathbf{z}}\cdot\mathbf{v}$, which is why we have that

$$\mathbf{v} = \hat{\mathbf{x}}\,(\hat{\mathbf{x}}\cdot\mathbf{v}) + \hat{\mathbf{y}}\,(\hat{\mathbf{y}}\cdot\mathbf{v}) + \hat{\mathbf{z}}\,(\hat{\mathbf{z}}\cdot\mathbf{v}).$$

Another definition that is useful (albeit only in a 3-dimensional space) is the cross product 38 of two vectors:

$$\mathbf{a}\times\mathbf{b} = (a_yb_z - a_zb_y)\,\hat{\mathbf{x}} + (a_zb_x - a_xb_z)\,\hat{\mathbf{y}} + (a_xb_y - a_yb_x)\,\hat{\mathbf{z}}.$$

36 http://en.wikipedia.org/wiki/Translation%20%28geometry%29
37 http://en.wikipedia.org/wiki/Linear%20combination
38 http://en.wikipedia.org/wiki/Cross%20product

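• Show that the cross product is antisymmetric: $\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a}$.

As a consequence, $\mathbf{a}\times\mathbf{a} = 0$.

• Show that $\mathbf{a}\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\mathbf{a}\times\mathbf{b}) = 0$.

Thus $\mathbf{a}\times\mathbf{b}$ is perpendicular to both $\mathbf{a}$ and $\mathbf{b}$.

• Show that the magnitude of $\mathbf{a}\times\mathbf{b}$ equals $ab\sin\alpha$, where $\alpha$ is the angle between $\mathbf{a}$ and $\mathbf{b}$. Hint: use a coordinate system in which $\mathbf{a} = (a, 0, 0)$ and $\mathbf{b} = (b\cos\alpha, b\sin\alpha, 0)$.

Since $ab\sin\alpha$ is also the area $A$ of the parallelogram $P$ spanned by $\mathbf{a}$ and $\mathbf{b}$, we can think of $\mathbf{a}\times\mathbf{b}$ as a vector of magnitude $A$ perpendicular to $P$. Since the cross product yields a vector, it is also known as vector product.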

(We save ourselves the trouble of showing that the cross product is invariant under translations 
and rotations of the coordinate axes, as is required of a vector. Let us however note in 
passing that if a and b are polar vectors, then a x b is an axial vector. Under a reflection 
(for instance, the inversion of a coordinate axis) an ordinary (or polar ) vector is invariant, 
whereas an axial vector 39 changes its sign.) 

Here is a useful relation involving both scalar and vector products: 


$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}\,(\mathbf{c}\cdot\mathbf{a}) - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c}.$$
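The properties of the cross product stated above, including this last relation, can be checked on concrete numbers. The sketch below is an illustration, not a proof: the helper names `dot` and `cross` and the three sample vectors are assumptions, chosen so that all intermediate values are exactly representable.

```python
def dot(a, b):
    """Dot product of two 3-component tuples."""
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    """Cross product, component by component as in the definition."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


a = (1.0, 2.0, 3.0)
b = (-2.0, 0.5, 4.0)
c = (0.0, -1.0, 2.0)

axb = cross(a, b)
bxa = cross(b, a)

# Left- and right-hand sides of a x (b x c) = b (c . a) - (a . b) c
lhs = cross(a, cross(b, c))
rhs = tuple(bi * dot(c, a) - dot(a, b) * ci for bi, ci in zip(b, c))

print(axb, bxa)
print(lhs, rhs)
```

The squared magnitude of `axb` can also be compared with $a^2b^2 - (\mathbf{a}\cdot\mathbf{b})^2$, which equals $(ab\sin\alpha)^2$ because $\mathbf{a}\cdot\mathbf{b} = ab\cos\alpha$.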




7.2.4 Fields 

As you will remember, a function is a machine that accepts a number and returns a number. 
A field is a function that accepts the three coordinates of a point or the four coordinates of 
a spacetime point and returns a scalar, a vector, or a tensor (either of the spatial variety or 
of the 4-dimensional spacetime variety).


Gradient 

Imagine a curve $C$ in 3-dimensional space. If we label the points of this curve by some parameter $\lambda$, then $C$ can be represented by a 3-vector function $\mathbf{r}(\lambda)$. We are interested in how much the value of a scalar field $f(x,y,z)$ changes as we go from a point $\mathbf{r}(\lambda)$ of $C$ to the point $\mathbf{r}(\lambda + d\lambda)$ of $C$. By how much $f$ changes will depend on how much the coordinates $(x,y,z)$ of $\mathbf{r}$ change, which are themselves functions of $\lambda$. The changes in the coordinates are evidently given by

$$(*)\qquad dx = \frac{dx}{d\lambda}\,d\lambda,\quad dy = \frac{dy}{d\lambda}\,d\lambda,\quad dz = \frac{dz}{d\lambda}\,d\lambda,$$

while the change in $f$ is a compound of three changes, one due to the change in $x$, one due to the change in $y$, and one due to the change in $z$:

$$(**)\qquad df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz.$$

39 http://en.wikipedia.org/wiki/Pseudovector

The first term tells us by how much $f$ changes as we go from $(x,y,z)$ to $(x+dx,y,z)$, the second tells us by how much $f$ changes as we go from $(x,y,z)$ to $(x,y+dy,z)$, and the third tells us by how much $f$ changes as we go from $(x,y,z)$ to $(x,y,z+dz)$.

Shouldn't we add the changes in $f$ that occur as we go first from $(x,y,z)$ to $(x+dx,y,z)$, then from $(x+dx,y,z)$ to $(x+dx,y+dy,z)$, and then from $(x+dx,y+dy,z)$ to $(x+dx,y+dy,z+dz)$? Let's calculate.

$$\frac{\partial f(x+dx,y,z)}{\partial y} = \frac{\partial\left[f(x,y,z) + \frac{\partial f}{\partial x}\,dx\right]}{\partial y} = \frac{\partial f(x,y,z)}{\partial y} + \frac{\partial^2 f}{\partial y\,\partial x}\,dx.$$

If we take the limit $dx \to 0$ (as we mean to whenever we use $dx$), the last term vanishes. Hence we may as well use $\frac{\partial f(x,y,z)}{\partial y}$ in place of $\frac{\partial f(x+dx,y,z)}{\partial y}$. Plugging (*) into (**), we obtain

$$df = \left(\frac{\partial f}{\partial x}\frac{dx}{d\lambda} + \frac{\partial f}{\partial y}\frac{dy}{d\lambda} + \frac{\partial f}{\partial z}\frac{dz}{d\lambda}\right)d\lambda.$$


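Think of the expression in brackets as the dot product of two vectors:

• the gradient $\frac{\partial f}{\partial\mathbf{r}}$ of the scalar field $f$, which is a vector field with components $\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}$,

• the vector $\frac{d\mathbf{r}}{d\lambda}$, which is tangent on $C$.

If we think of $\lambda$ as the time at which an object moving along $C$ is at $\mathbf{r}(\lambda)$, then the magnitude of $\frac{d\mathbf{r}}{d\lambda}$ is this object's speed.

$\frac{\partial}{\partial\mathbf{r}}$ is a differential operator that accepts a function $f(\mathbf{r})$ and returns its gradient $\frac{\partial f}{\partial\mathbf{r}}$.

The gradient of $f$ is another input-output device: pop in $d\mathbf{r}$, and get the difference

$$\frac{\partial f}{\partial\mathbf{r}}\cdot d\mathbf{r} = df = f(\mathbf{r}+d\mathbf{r}) - f(\mathbf{r}).$$

The differential operator $\frac{\partial}{\partial\mathbf{r}}$ is also used in conjunction with the dot and cross products.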
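The statement that the gradient turns a small displacement $d\mathbf{r}$ into the change $df$ can be illustrated numerically. The sketch below rests on assumptions made for the example: the scalar field $f = x^2y + z^3$, the point and the displacement are arbitrary choices, and the gradient is estimated by central differences instead of being computed exactly.

```python
def f(x, y, z):
    """Illustrative scalar field f = x**2 * y + z**3 (an assumption, not from the text)."""
    return x * x * y + z ** 3


def gradient(field, r, h=1e-6):
    """Central-difference estimate of (df/dx, df/dy, df/dz) at the point r."""
    x, y, z = r
    return (
        (field(x + h, y, z) - field(x - h, y, z)) / (2 * h),
        (field(x, y + h, z) - field(x, y - h, z)) / (2 * h),
        (field(x, y, z + h) - field(x, y, z - h)) / (2 * h),
    )


r = (1.0, 2.0, 3.0)
dr = (1e-4, -2e-4, 3e-4)  # a small but finite displacement

grad = gradient(f, r)  # exact value at r would be (2xy, x**2, 3z**2) = (4, 1, 27)
df_linear = sum(g * d for g, d in zip(grad, dr))             # gradient . dr
df_actual = f(*(ri + di for ri, di in zip(r, dr))) - f(*r)   # f(r + dr) - f(r)

print(grad)
print(df_linear, df_actual)
```

For a finite displacement the two numbers differ by terms of second order in `dr`; the equality in the text holds in the limit of an infinitesimal displacement.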


Curl 

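The curl of a vector field $\mathbf{A}$ is defined by

$$\operatorname{curl}\mathbf{A} = \frac{\partial}{\partial\mathbf{r}}\times\mathbf{A} = \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right)\hat{\mathbf{x}} + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\hat{\mathbf{y}} + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\hat{\mathbf{z}}.$$

To see what this definition is good for, let us calculate the integral $\oint\mathbf{A}\cdot d\mathbf{r}$ over a closed curve $C$. (An integral over a curve is called a line integral, and if the curve is closed it is called a loop integral.) This integral is called the circulation of $\mathbf{A}$ along $C$ (or around the surface enclosed by $C$). Let's start with the boundary of an infinitesimal rectangle with corners $A = (0,0,0)$, $B = (0,dy,0)$, $C = (0,dy,dz)$, and $D = (0,0,dz)$.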



Figure 75 


The contributions from the four sides are, respectively,

• $AB$: $A_y(0,dy/2,0)\,dy$,

• $BC$: $A_z(0,dy,dz/2)\,dz = \left[A_z(0,0,dz/2) + \frac{\partial A_z}{\partial y}\,dy\right]dz$,

• $CD$: $-A_y(0,dy/2,dz)\,dy = -\left[A_y(0,dy/2,0) + \frac{\partial A_y}{\partial z}\,dz\right]dy$,

• $DA$: $-A_z(0,0,dz/2)\,dz$.

These add up to

$$(***)\qquad \left[\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right]dy\,dz = (\operatorname{curl}\mathbf{A})_x\,dy\,dz.$$

Figure 76

Let us represent this infinitesimal rectangle of area $dy\,dz$ (lying in the $y$-$z$ plane) by a vector $d\mathbf{\Sigma}$ whose magnitude equals $d\Sigma = dy\,dz$, and which is perpendicular to the rectangle. (There are two possible directions. The right-hand rule illustrated on the right indicates how the direction of $d\mathbf{\Sigma}$ is related to the direction of circulation.) This allows us to write (***) as a scalar (product) $\operatorname{curl}\mathbf{A}\cdot d\mathbf{\Sigma}$. Being a scalar, it is invariant under rotations either of the coordinate axes or of the infinitesimal rectangle. Hence if we cover a surface $\Sigma$ with infinitesimal rectangles and add up their circulations, we get $\int_\Sigma\operatorname{curl}\mathbf{A}\cdot d\mathbf{\Sigma}$.

Observe that the common sides of all neighboring rectangles are integrated over twice in opposite directions. Their contributions cancel out and only the contributions from the boundary $\partial\Sigma$ of $\Sigma$ survive.

The bottom line: $\oint_{\partial\Sigma}\mathbf{A}\cdot d\mathbf{r} = \int_\Sigma\operatorname{curl}\mathbf{A}\cdot d\mathbf{\Sigma}$.

Figure 77

This is Stokes' theorem. Note that the left-hand side depends solely on the boundary $\partial\Sigma$ of $\Sigma$. So, therefore, does the right-hand side. The value of the surface integral of the curl of a vector field depends solely on the values of the vector field at the boundary of the surface integrated over.
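Stokes' theorem can be tested on a simple example by computing both sides numerically. Everything specific in this sketch is an assumption made for illustration: the planar field $\mathbf{A} = (-y^3, x^3, 0)$, the unit square in the $x$-$y$ plane as the surface (the text's infinitesimal rectangle lies in the $y$-$z$ plane), and the midpoint rule for both integrals. For this field the only nonzero component of the curl is $(\operatorname{curl}\mathbf{A})_z = 3x^2 + 3y^2$.

```python
def A(x, y):
    """x and y components of the illustrative field A = (-y**3, x**3, 0)."""
    return (-y ** 3, x ** 3)


def curl_z(x, y):
    """z component of curl A for the field above: dA_y/dx - dA_x/dy."""
    return 3 * x ** 2 + 3 * y ** 2


def circulation(n=1000):
    """Loop integral of A . dr counterclockwise around the unit square [0,1] x [0,1]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h              # midpoint of the k-th segment on each side
        total += A(t, 0.0)[0] * h      # bottom side, dr = (+h, 0)
        total += A(1.0, t)[1] * h      # right side,  dr = (0, +h)
        total -= A(t, 1.0)[0] * h      # top side,    dr = (-h, 0)
        total -= A(0.0, t)[1] * h      # left side,   dr = (0, -h)
    return total


def curl_flux(n=200):
    """Surface integral of (curl A)_z over the unit square, midpoint rule."""
    h = 1.0 / n
    return sum(curl_z((i + 0.5) * h, (j + 0.5) * h) * h * h
               for i in range(n) for j in range(n))


loop_value = circulation()
surface_value = curl_flux()
print(loop_value, surface_value)
```

Both numbers should come out close to 2, the exact value of either side for this field and this square.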

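If the vector field $\mathbf{A}$ is the gradient of a scalar field $f$, and if $C$ is a curve from a point $\mathbf{a}$ to a point $\mathbf{b}$, then

$$\int_C\mathbf{A}\cdot d\mathbf{r} = \int_C df = f(\mathbf{b}) - f(\mathbf{a}).$$

The line integral of a gradient thus is the same for all curves having identical end points. If $\mathbf{b} = \mathbf{a}$ then $C$ is a loop and $\oint_C\mathbf{A}\cdot d\mathbf{r}$ vanishes. By Stokes' theorem it follows that the curl of a gradient vanishes identically:

$$\int_\Sigma\left(\operatorname{curl}\frac{\partial f}{\partial\mathbf{r}}\right)\cdot d\mathbf{\Sigma} = \oint_{\partial\Sigma}\frac{\partial f}{\partial\mathbf{r}}\cdot d\mathbf{r} = 0.$$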


Divergence 

The divergence of a vector field $\mathbf{A}$ is defined by

$$\operatorname{div}\mathbf{A} = \frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}.$$

To see what this definition is good for, consider an infinitesimal volume element $d^3r$ with sides $dx, dy, dz$. Let us calculate the net (outward) flux of a vector field $\mathbf{A}$ through the surface of $d^3r$. There are three pairs of opposite sides. The net flux through the surfaces perpendicular to the $x$ axis is

$$A_x(x+dx,y,z)\,dy\,dz - A_x(x,y,z)\,dy\,dz = \frac{\partial A_x}{\partial x}\,dx\,dy\,dz.$$

It is obvious what the net flux through the remaining surfaces will be. The net flux of $\mathbf{A}$ out of $d^3r$ thus equals

$$\left[\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right]dx\,dy\,dz = \operatorname{div}\mathbf{A}\,d^3r.$$


If we fill up a region $R$ with infinitesimal parallelepipeds and add up their net outward fluxes, we get $\int_R\operatorname{div}\mathbf{A}\,d^3r$. Observe that the common sides of all neighboring parallelepipeds are integrated over twice with opposite signs: the flux out of one equals the flux into the other. Hence their contributions cancel out and only the contributions from the surface $\partial R$ of $R$ survive. The bottom line:

$$\oint_{\partial R}\mathbf{A}\cdot d\mathbf{\Sigma} = \int_R\operatorname{div}\mathbf{A}\,d^3r.$$

This is Gauss' law. Note that the left-hand side depends solely on the boundary $\partial R$ of $R$.
So, therefore, does the right-hand side. The value of the volume integral of the divergence of 
a vector field depends solely on the values of the vector field at the boundary of the region 
integrated over. 
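As with Stokes' theorem, both sides of Gauss' law can be computed numerically for a simple case. The sketch below depends on choices made purely for illustration: the field $\mathbf{A} = (x^2, yz, z^3)$, the unit cube as the region $R$, and the midpoint rule. For this field $\operatorname{div}\mathbf{A} = 2x + z + 3z^2$.

```python
def A(x, y, z):
    """Illustrative vector field A = (x**2, y*z, z**3) (an assumption, not from the text)."""
    return (x * x, y * z, z ** 3)


def div_A(x, y, z):
    """Divergence of the field above: dA_x/dx + dA_y/dy + dA_z/dz."""
    return 2 * x + z + 3 * z * z


def outward_flux(n=100):
    """Net outward flux of A through the six faces of the unit cube, midpoint rule."""
    h = 1.0 / n
    mids = [(k + 0.5) * h for k in range(n)]
    total = 0.0
    for u in mids:
        for v in mids:
            total += (A(1.0, u, v)[0] - A(0.0, u, v)[0]) * h * h  # faces x = 1 and x = 0
            total += (A(u, 1.0, v)[1] - A(u, 0.0, v)[1]) * h * h  # faces y = 1 and y = 0
            total += (A(u, v, 1.0)[2] - A(u, v, 0.0)[2]) * h * h  # faces z = 1 and z = 0
    return total


def volume_integral(n=40):
    """Integral of div A over the unit cube, midpoint rule."""
    h = 1.0 / n
    mids = [(k + 0.5) * h for k in range(n)]
    return sum(div_A(x, y, z) for x in mids for y in mids for z in mids) * h ** 3


flux_value = outward_flux()
volume_value = volume_integral()
print(flux_value, volume_value)
```

Both values should be close to 2.5, which is what either side gives exactly for this field and this cube.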

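If $\Sigma$ is a closed surface, and thus the boundary $\partial R$ of a region of space $R$, then $\Sigma$ itself has no boundary (symbolically, $\partial\Sigma = 0$). Combining Stokes' theorem with Gauss' law we have that

$$\oint_{\partial\partial R}\mathbf{A}\cdot d\mathbf{r} = \int_{\partial R}\operatorname{curl}\mathbf{A}\cdot d\mathbf{\Sigma} = \int_R\operatorname{div}\operatorname{curl}\mathbf{A}\,d^3r.$$

The left-hand side is an integral over the boundary of a boundary. But a boundary has no boundary! The boundary of a boundary is zero: $\partial\partial = 0$. It follows, in particular, that the right-hand side is zero. Thus not only the curl of a gradient but also the divergence of a curl vanishes identically:

$$\frac{\partial}{\partial\mathbf{r}}\times\frac{\partial f}{\partial\mathbf{r}} = 0,\qquad \frac{\partial}{\partial\mathbf{r}}\cdot\left(\frac{\partial}{\partial\mathbf{r}}\times\mathbf{A}\right) = 0.$$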

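Some useful identities

$$\frac{\partial}{\partial\mathbf{r}}\times\left(\frac{\partial}{\partial\mathbf{r}}\times\mathbf{A}\right) = \frac{\partial}{\partial\mathbf{r}}\left(\frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{A}\right) - \left(\frac{\partial}{\partial\mathbf{r}}\cdot\frac{\partial}{\partial\mathbf{r}}\right)\mathbf{A}$$

7.3 The ABCs of relativity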

See also the Wikibook Special relativity 42 that contains an in-depth text on this subject. 

7.3.1 The principle of relativity 

If we use an inertial system 43 (a.k.a. inertial coordinate system, inertial frame of reference, or inertial reference frame), then the components $x, y, z$ of the position of any freely moving classical object ("point mass") change by equal amounts $\Delta x, \Delta y, \Delta z$ in equal time intervals $\Delta t$. Evidently, if $\mathcal{F}_1$ is an inertial frame then so is a reference frame $\mathcal{F}_2$ that is, relative to $\mathcal{F}_1$,

1. shifted ("translated") in space by any distance and/or in any direction, 

2. translated in time by any interval, 

3. rotated by any angle about any axis, and/or 

4. moving with any constant velocity. 

The principle of relativity states that all inertial systems are "created equal": the laws of 
physics are the same as long as they are formulated with respect to an inertial frame — 
no matter which. (Describing the same physical event or state of affairs using different 
inertial systems is like saying the same thing in different languages.) The first three items 
tell us that one inertial frame is as good as any other frame as long as the other frame 
differs by a shift of the coordinate origin in space and/or time and/or by a rotation of the 
spatial coordinate axes. What matters in physics are relative positions (the positions of 
objects relative to each other), relative times (the times of events relative to each other), and 
relative orientations (the orientations of objects relative to each other), inasmuch as these 
are unaffected by translations in space and/or time and by rotations of the spatial axes. In 
the physical world, there are no absolute positions, absolute times, or absolute orientations. 

The fourth item tells us, in addition, that one inertial frame is as good as any other frame as long as the two frames move with a constant velocity relative to each other. What matters are relative velocities (the velocities of objects relative to each other), inasmuch as these are unaffected by a coordinate boost: the switch from an inertial frame $\mathcal{F}$ to a frame moving with a constant velocity relative to $\mathcal{F}$. In the physical world, there are no absolute velocities and, in particular, there is no absolute rest.

It stands to reason. For one thing, positions are properties of objects, not things that exist even when they are not "occupied" or possessed. For another, the positions of objects are defined relative to the positions of other objects. In a universe containing a single object, there is no position that one could attribute to that object. By the same token, all physically meaningful times are the times of physical events, and they too are relatively defined, as the times between events. In a universe containing a single event, there is no time that one could attribute to that event. But if positions and times are relatively defined, then so are velocities.

42 http://en.wikibooks.org/wiki/Special%20relativity
43 http://en.wikipedia.org/wiki/Inertial%20frame%20of%20reference

That there is no such thing as absolute rest has not always been as obvious as it should 
have been. Two ideas were responsible for the erroneous notion that there is a special class 
of inertial frames defining "rest" in an absolute sense: the idea that electromagnetic effects 
are transmitted by waves, and the idea that these waves require a physical medium (dubbed 
"ether") for their propagation. If there were such a medium, one could define absolute rest 
as equivalent to being at rest with respect to it. 



7.3.2 Lorentz transformations (general form) 

We want to express the coordinates $t$ and $\mathbf{r}=(x,y,z)$ of an inertial frame $\mathcal{F}_1$ in terms of the
coordinates $t'$ and $\mathbf{r}'=(x',y',z')$ of another inertial frame $\mathcal{F}_2$. We will assume that the two
frames meet the following conditions:

1. their spacetime coordinate origins coincide ($t'=0$, $\mathbf{r}'=0$ mark the same spacetime
location as $t=0$, $\mathbf{r}=0$),

2. their space axes are parallel, and

3. $\mathcal{F}_2$ moves with a constant velocity $\mathbf{w}$ relative to $\mathcal{F}_1$.

What we know at this point is that whatever moves with a constant velocity in $\mathcal{F}_1$ will do so
in $\mathcal{F}_2$. It follows that the transformation $t,\mathbf{r}\to t',\mathbf{r}'$ maps straight lines in $\mathcal{F}_1$ onto straight
lines in $\mathcal{F}_2$. Coordinate lines of $\mathcal{F}_1$, in particular, will be mapped onto straight lines in $\mathcal{F}_2$.
This tells us that the dashed coordinates are linear combinations of the undashed ones,

$$t' = A\,t + \mathbf{B}\cdot\mathbf{r}, \qquad \mathbf{r}' = C\,\mathbf{r} + (\mathbf{D}\cdot\mathbf{r})\,\mathbf{w} + \mathbf{f}\,t.$$

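We also know that the transformation from $\mathcal{F}_1$ to $\mathcal{F}_2$ can only depend on $\mathbf{w}$, so $A$, $\mathbf{B}$, $C$, $\mathbf{D}$,
and $\mathbf{f}$ are functions of $\mathbf{w}$. Our task is to find these functions. The real-valued functions $A$
and $C$ actually can depend only on $w=|\mathbf{w}|=+\sqrt{\mathbf{w}\cdot\mathbf{w}}$, so $A=a(w)$ and $C=c(w)$. A vector
function depending only on $\mathbf{w}$ must be parallel (or antiparallel) to $\mathbf{w}$, and its magnitude must be
a function of $w$. We can therefore write $\mathbf{B}=b(w)\,\mathbf{w}$, $\mathbf{D}=[d(w)/w^2]\,\mathbf{w}$, and $\mathbf{f}=e(w)\,\mathbf{w}$. (It will
become clear in a moment why the factor $w^{-2}$ is included in the definition of $\mathbf{D}$.) So,

$$t' = a(w)\,t + b(w)\,\mathbf{w}\cdot\mathbf{r}, \qquad \mathbf{r}' = c(w)\,\mathbf{r} + d(w)\,\frac{\mathbf{w}\cdot\mathbf{r}}{w^2}\,\mathbf{w} + e(w)\,\mathbf{w}\,t.$$

Let's set $\mathbf{r}$ equal to $\mathbf{w}t$. This implies that $\mathbf{r}'=(c+d+e)\,\mathbf{w}t$. As we are looking at the
trajectory of an object at rest in $\mathcal{F}_2$, $\mathbf{r}'$ must be constant. Hence,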


$$c+d+e=0.$$

Let's write down the inverse transformation. Since $\mathcal{F}_1$ moves with velocity $-\mathbf{w}$ relative to
$\mathcal{F}_2$, it is

$$t = a(w)\,t' - b(w)\,\mathbf{w}\cdot\mathbf{r}', \qquad \mathbf{r} = c(w)\,\mathbf{r}' + d(w)\,\frac{\mathbf{w}\cdot\mathbf{r}'}{w^2}\,\mathbf{w} - e(w)\,\mathbf{w}\,t'.$$

To make life easier for us, we now choose the space axes so that $\mathbf{w}=(w,0,0)$. Then the above
two (mutually inverse) transformations simplify to

$$t' = at + bwx, \qquad x' = cx + dx + ewt, \qquad y' = cy, \qquad z' = cz,$$

$$t = at' - bwx', \qquad x = cx' + dx' - ewt', \qquad y = cy', \qquad z = cz'.$$

Plugging the first transformation into the second, we obtain

$$t = a(at+bwx) - bw(cx+dx+ewt) = (a^2 - bew^2)\,t + (abw - bcw - bdw)\,x,$$

$$x = c(cx+dx+ewt) + d(cx+dx+ewt) - ew(at+bwx) = (c^2+2cd+d^2-bew^2)\,x + (cew+dew-aew)\,t,$$

$$y = c^2 y, \qquad z = c^2 z.$$

The first of these equations tells us that $a^2 - bew^2 = 1$ and $abw - bcw - bdw = 0$.

The second tells us that $c^2+2cd+d^2-bew^2 = 1$ and $cew+dew-aew = 0$.

Combining $abw-bcw-bdw=0$ with $c+d+e=0$ (and taking into account that $w\neq 0$), we
obtain $b(a+e)=0$.

Using $c+d+e=0$ to eliminate $d$, we obtain $e^2-bew^2=1$ and $e(a+e)=0$.

Since the first of the last two equations implies that $e\neq 0$, we gather from the second that
$e=-a$.

$y=c^2y$ tells us that $c^2=1$. $c$ must, in fact, be equal to 1, since we have assumed that the
space axes of the two frames are parallel (rather than antiparallel).

With $c=1$ and $e=-a$, $c+d+e=0$ yields $d=a-1$. Upon solving $e^2-bew^2=1$ for $b$, we
are left with expressions for $b$, $c$, $d$, and $e$ depending solely on $a$:

$$b = \frac{1-a^2}{a\,w^2}, \qquad c = 1, \qquad d = a-1, \qquad e = -a.$$

Quite an improvement! 
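
The reduction to a single unknown function can be checked numerically. The following is only a sketch: the helper names `coefficients`, `forward`, and `inverse` and the sample values of $a$ and $w$ are illustrative assumptions, not part of the derivation. It uses exact rational arithmetic to confirm that, with $b$, $c$, $d$, $e$ expressed through $a$ as above, the two one-dimensional transformations really are mutually inverse.

```python
from fractions import Fraction

def coefficients(a, w):
    """Return (b, c, d, e) expressed through a and w, as derived in the text."""
    b = (1 - a**2) / (a * w**2)
    return b, Fraction(1), a - 1, -a

def forward(t, x, a, w):
    """t' = a t + b w x,  x' = c x + d x + e w t."""
    b, c, d, e = coefficients(a, w)
    return a * t + b * w * x, (c + d) * x + e * w * t

def inverse(tp, xp, a, w):
    """t = a t' - b w x',  x = c x' + d x' - e w t'."""
    b, c, d, e = coefficients(a, w)
    return a * tp - b * w * xp, (c + d) * xp - e * w * tp

# Hypothetical sample values; any a != 0 and w != 0 should do.
a, w = Fraction(3, 5), Fraction(2, 7)
t, x = Fraction(11), Fraction(-4)

tp, xp = forward(t, x, a, w)
assert inverse(tp, xp, a, w) == (t, x)   # the round trip returns the original event
```

Note that the check goes through for any nonzero value of $a$; the constraints found so far do not yet fix $a(w)$ itself.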

To find the remaining function $a(w)$, we consider a third inertial frame $\mathcal{F}_3$, which moves
with velocity $\mathbf{v}=(v,0,0)$ relative to $\mathcal{F}_2$. Combining the transformation from $\mathcal{F}_1$ to $\mathcal{F}_2$,

$$t' = a(w)\,t + \frac{1-a^2(w)}{a(w)\,w}\,x, \qquad x' = a(w)\,x - a(w)\,w\,t,$$

with the transformation from $\mathcal{F}_2$ to $\mathcal{F}_3$,

$$t'' = a(v)\,t' + \frac{1-a^2(v)}{a(v)\,v}\,x', \qquad x'' = a(v)\,x' - a(v)\,v\,t',$$

we obtain the transformation from $\mathcal{F}_1$ to $\mathcal{F}_3$:

$$t'' = a(v)\left[a(w)\,t + \frac{1-a^2(w)}{a(w)\,w}\,x\right] + \frac{1-a^2(v)}{a(v)\,v}\,\Big[a(w)\,x - a(w)\,w\,t\Big]
= \underbrace{\left[a(v)\,a(w) - \frac{1-a^2(v)}{a(v)\,v}\,a(w)\,w\right]}_{\star}\,t + [\dots]\,x,$$

$$x'' = a(v)\,\Big[a(w)\,x - a(w)\,w\,t\Big] - a(v)\,v\left[a(w)\,t + \frac{1-a^2(w)}{a(w)\,w}\,x\right]
= \underbrace{\left[a(v)\,a(w) - a(v)\,v\,\frac{1-a^2(w)}{a(w)\,w}\right]}_{\star\star}\,x - [\dots]\,t.$$



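The direct transformation from $\mathcal{F}_1$ to $\mathcal{F}_3$ must have the same form as the transformations
from $\mathcal{F}_1$ to $\mathcal{F}_2$ and from $\mathcal{F}_2$ to $\mathcal{F}_3$, namely

$$t'' = \underbrace{a(u)}_{\star}\,t + \frac{1-a^2(u)}{a(u)\,u}\,x, \qquad x'' = \underbrace{a(u)}_{\star\star}\,x - a(u)\,u\,t,$$
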


where $u$ is the speed of $\mathcal{F}_3$ relative to $\mathcal{F}_1$. Comparison of the coefficients marked with stars
yields two expressions for $a(u)$, which of course must be equal:

$$a(v)\,a(w) - \frac{1-a^2(v)}{a(v)\,v}\,a(w)\,w = a(v)\,a(w) - a(v)\,v\,\frac{1-a^2(w)}{a(w)\,w}.$$

It follows that $[1-a^2(v)]\,a^2(w)\,w^2 = [1-a^2(w)]\,a^2(v)\,v^2$, and this tells us that

$$K = \frac{1-a^2(w)}{a^2(w)\,w^2} = \frac{1-a^2(v)}{a^2(v)\,v^2}$$

is a universal constant. Solving the first equality for $a(w)$, we obtain

$$a(w) = \frac{1}{\sqrt{1+Kw^2}}.$$
This allows us to cast the transformation

$$t' = at + bwx, \qquad x' = cx + dx + ewt, \qquad y' = cy, \qquad z' = cz,$$

into the form

$$t' = \frac{t+Kwx}{\sqrt{1+Kw^2}}, \qquad x' = \frac{x-wt}{\sqrt{1+Kw^2}}, \qquad y'=y, \qquad z'=z.$$

Trumpets, please! We have managed to reduce five unknown functions to a single constant. 
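
As a sanity check on the general form, here is a minimal numerical sketch. The function name `transform` and the sample values of $K$, $w$, $t$, $x$ are assumptions made for illustration. It verifies that applying the transformation with $-w$ undoes the transformation with $w$, for one negative, one vanishing, and one positive value of $K$.

```python
import math

def transform(t, x, w, K):
    """General-form transformation along x for relative velocity w and constant K."""
    root = math.sqrt(1 + K * w**2)
    return (t + K * w * x) / root, (x - w * t) / root

t, x, w = 2.0, 3.0, 0.4            # hypothetical event and relative speed
for K in (-1.0, 0.0, 0.5):         # hypothetical sample values of the constant
    tp, xp = transform(t, x, w, K)
    t_back, x_back = transform(tp, xp, -w, K)   # the inverse uses velocity -w
    assert math.isclose(t_back, t) and math.isclose(x_back, x)
```

For $K<0$ the square root is real only if $w^2 < -1/K$, which is why the sample speed is kept small.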


7.3.3 Composition of velocities 

In fact, there are only three physically distinct possibilities. (If $K\neq 0$, the magnitude of $K$
depends on the choice of units, and this tells us something about us rather than anything
about the physical world.)

The possibility $K=0$ yields the Galilean transformations of Newtonian ("non-relativistic")
mechanics:

$$t' = t, \qquad \mathbf{r}' = \mathbf{r} - \mathbf{w}t, \qquad u = v+w, \qquad ds = dt.$$

(The common practice of calling theories with this transformation law "non-relativistic" is
inappropriate, inasmuch as they too satisfy the principle of relativity.) In the remainder of
this section we assume that $K\neq 0$.

Suppose that object C moves with speed v relative to object B, and that this moves with 
speed w relative to object A. If B and C move in the same direction, what is the speed u 
of C relative to A? In the previous section we found that

$$a(u) = a(v)\,a(w) - \frac{1-a^2(v)}{a(v)\,v}\,a(w)\,w,$$

and that

$$K = \frac{1-a^2(v)}{a^2(v)\,v^2}.$$

This allows us to write

$$a(u) = a(v)\,a(w) - \frac{1-a^2(v)}{a^2(v)\,v^2}\,a(v)\,v\,a(w)\,w = a(v)\,a(w)\,(1-Kvw).$$

Expressing $a$ in terms of $K$ and the respective velocities, we obtain

$$\frac{1}{\sqrt{1+Ku^2}} = \frac{1-Kvw}{\sqrt{1+Kv^2}\,\sqrt{1+Kw^2}},$$

which implies that

$$1+Ku^2 = \frac{(1+Kv^2)(1+Kw^2)}{(1-Kvw)^2}.$$

We massage this into

$$Ku^2 = \frac{(1+Kv^2)(1+Kw^2) - (1-Kvw)^2}{(1-Kvw)^2} = \frac{K(v+w)^2}{(1-Kvw)^2},$$

divide by $K$, and end up with:

$$u = \frac{v+w}{1-Kvw}.$$

Thus, unless K = 0, we don't get the speed of C relative to A by simply adding the speed 
of C relative to B to the speed of B relative to A. 
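
The composition law can be tested against the transformation itself. The sketch below (function names and sample values are illustrative assumptions) checks that two successive transformations with velocities $w$ and $v$ agree with a single transformation with $u=(v+w)/(1-Kvw)$.

```python
import math

def a(w, K):
    """The function a(w) = 1 / sqrt(1 + K w^2)."""
    return 1 / math.sqrt(1 + K * w**2)

def transform(t, x, w, K):
    """General-form transformation along x."""
    return a(w, K) * (t + K * w * x), a(w, K) * (x - w * t)

def compose(v, w, K):
    """Speed u of C relative to A, given v (C relative to B) and w (B relative to A)."""
    return (v + w) / (1 - K * v * w)

t, x, w, v = 2.0, 1.0, 0.3, 0.5        # hypothetical sample values
for K in (-1.0, 0.0, 0.5):
    u = compose(v, w, K)
    two_steps = transform(*transform(t, x, w, K), v, K)
    one_step = transform(t, x, u, K)
    assert math.isclose(two_steps[0], one_step[0])
    assert math.isclose(two_steps[1], one_step[1])
```

For $K=0$ the function `compose` reduces to plain addition, in line with the Galilean case.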


7.3.4 Proper time 

Consider an infinitesimal segment $d\mathcal{C}$ of a spacetime path $\mathcal{C}$. In $\mathcal{F}_1$ it has the components
$(dt,dx,dy,dz)$, in $\mathcal{F}_2$ it has the components $(dt',dx',dy',dz')$. Using the Lorentz transformation
in its general form,

$$t' = \frac{t+Kwx}{\sqrt{1+Kw^2}}, \qquad x' = \frac{x-wt}{\sqrt{1+Kw^2}}, \qquad y'=y, \qquad z'=z,$$

it is readily shown that

$$(dt')^2 + K\,d\mathbf{r}'\cdot d\mathbf{r}' = dt^2 + K\,d\mathbf{r}\cdot d\mathbf{r}.$$

We conclude that the expression

$$ds^2 = dt^2 + K\,d\mathbf{r}\cdot d\mathbf{r} = dt^2 + K\,(dx^2+dy^2+dz^2)$$

is invariant under this transformation. It is also invariant under rotations of the spatial axes
(why?) and translations of the spacetime coordinate origin. This makes $ds$ a 4-scalar.
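
For readers who want the "readily shown" step spelled out, here is one way to do it. Because the transformation is linear, the differentials transform like the coordinates themselves; the $y$ and $z$ components are unchanged and contribute $K(dy^2+dz^2)$ on both sides, so only $dt$ and $dx$ need attention:

```latex
\begin{align*}
(dt')^2 + K\,(dx')^2
  &= \frac{(dt + Kw\,dx)^2 + K\,(dx - w\,dt)^2}{1+Kw^2} \\
  &= \frac{dt^2 + 2Kw\,dt\,dx + K^2w^2\,dx^2 + K\,dx^2 - 2Kw\,dt\,dx + Kw^2\,dt^2}{1+Kw^2} \\
  &= \frac{(1+Kw^2)\,dt^2 + K\,(1+Kw^2)\,dx^2}{1+Kw^2}
   = dt^2 + K\,dx^2 .
\end{align*}
```

The cross terms cancel, and the common factor $1+Kw^2$ drops out.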

What is the physical significance of ds? 

A clock that travels along dC is at rest in any frame in which dC lacks spatial components. 
In such a frame, ds 2 = dt 2 . Hence ds is the time it takes to travel along dC as measured by 
a clock that travels along dC. ds is the proper time (or proper duration ) of dC. The proper 
time (or proper duration) of a finite spacetime path C, accordingly, is 

$$\int_{\mathcal{C}} ds = \int_{\mathcal{C}} \sqrt{dt^2 + K\,d\mathbf{r}\cdot d\mathbf{r}} = \int_{\mathcal{C}} dt\,\sqrt{1+Kv^2}.$$




7.3.5 An invariant speed 

If $K<0$, then there is a universal constant $c\equiv 1/\sqrt{-K}$ with the dimension of a velocity,
and we can cast $u=(v+w)/(1-Kvw)$ into the form

$$u = \frac{v+w}{1+vw/c^2}.$$

If we plug in $v=w=c/2$, then instead of the Galilean $u=v+w=c$, we have $u=\tfrac{4}{5}c<c$. More
intriguingly, if object $O$ moves with speed $c$ relative to $\mathcal{F}_2$, and if $\mathcal{F}_2$ moves with speed $w$
relative to $\mathcal{F}_1$, then $O$ moves with the same speed $c$ relative to $\mathcal{F}_1$: $(w+c)/(1+wc/c^2)=c$.
The speed of light $c$ thus is an invariant speed: whatever travels with it in one inertial frame
travels with the same speed in every inertial frame.
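
Both statements are easy to confirm with exact arithmetic. In this small sketch the function name `compose` and the choice of units with $c=1$ are assumptions made for convenience:

```python
from fractions import Fraction

def compose(v, w, c=Fraction(1)):
    """Composition of collinear speeds for K = -1/c^2."""
    return (v + w) / (1 + v * w / c**2)

c = Fraction(1)

# Two speeds of c/2 compose to 4c/5, not to c.
assert compose(c / 2, c / 2) == Fraction(4, 5) * c

# Composing any speed w with c gives c again: c is invariant.
for w in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    assert compose(w, c) == c
```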

Starting from 


$$ds^2 = (dt')^2 - d\mathbf{r}'\cdot d\mathbf{r}'/c^2 = dt^2 - d\mathbf{r}\cdot d\mathbf{r}/c^2,$$

we arrive at the same conclusion: if $O$ travels with $c$ relative to $\mathcal{F}_1$, then it travels the distance
$dr=c\,dt$ in the time $dt$. Therefore $ds^2 = dt^2 - dr^2/c^2 = 0$. But then $(dt')^2 - (dr')^2/c^2 = 0$,
and this implies $dr'=c\,dt'$. It follows that $O$ travels with the same speed $c$ relative to $\mathcal{F}_2$.

An invariant speed also exists if K = 0, but in this case it is infinite: whatever travels with 
infinite speed in one inertial frame — it takes no time to get from one place to another — 
does so in every inertial frame. 

The existence of an invariant speed prevents objects from making U-turns in spacetime. If 
$K=0$, it obviously takes an infinite amount of energy to reach $v=\infty$. Since an infinite
amount of energy isn't at our disposal, we cannot start vertically in a spacetime diagram
and then make a U-turn (that is, we cannot reach, let alone "exceed", a horizontal slope).
("Exceeding" a horizontal slope here means changing from a positive to a negative slope, or
from going forward to going backward in time.)

If K <0, it takes an infinite amount of energy to reach even the finite speed of light. Imagine 
you spent a finite amount of fuel accelerating from 0 to 0.1c. In the frame in which you are 
now at rest, your speed is not a whit closer to the speed of light. And this remains true 
no matter how many times you repeat the procedure. Thus no finite amount of energy can 
make you reach, let alone "exceed", a slope equal to 1/c. ("Exceeding" a slope equal to 1/c 
means attaining a smaller slope. As we will see, if we were to travel faster than light in any 
one frame, then there would be frames in which we travel backward in time.) 
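
The thought experiment of repeatedly accelerating by $0.1c$ can be simulated. This sketch assumes units with $c=1$; the function name `compose` and the number of repetitions (100) are arbitrary illustrative choices. Each step composes the current speed with a further $0.1c$ gained in the momentary rest frame.

```python
from fractions import Fraction

def compose(v, w):
    """Relativistic composition of collinear speeds, in units with c = 1."""
    return (v + w) / (1 + v * w)

step = Fraction(1, 10)     # each burn adds 0.1c in the current rest frame
speed = Fraction(0)
speeds = []
for _ in range(100):
    speed = compose(speed, step)
    speeds.append(speed)

# The speed keeps growing, yet it stays below c after every step.
assert all(earlier < later for earlier, later in zip(speeds, speeds[1:]))
assert all(s < 1 for s in speeds)
print(float(speeds[0]), float(speeds[9]), float(speeds[-1]))
```

Exact fractions are used so that the comparison with $c$ is not blurred by rounding.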



7.3.6 The case against K > 0 

In a hypothetical world with $K>0$ we can define $k\equiv 1/\sqrt{K}$ (a universal constant with the
dimension of a velocity), and we can cast $u=(v+w)/(1-Kvw)$ into the form

$$u = \frac{v+w}{1-vw/k^2}.$$

If we plug in $v=w=k/2$, then instead of the Galilean $u=v+w=k$ we have $u=\tfrac{4}{3}k>k$.
Worse, if we plug in $v=w=k$, we obtain $u=\infty$: if object $O$ travels with speed $k$ relative
to $\mathcal{F}_2$, and if $\mathcal{F}_2$ travels with speed $k$ relative to $\mathcal{F}_1$ (in the same direction), then $O$ travels
with an infinite speed relative to $\mathcal{F}_1$. And if $O$ travels with $2k$ relative to $\mathcal{F}_2$ and $\mathcal{F}_2$ travels
with $2k$ relative to $\mathcal{F}_1$, $O$'s speed relative to $\mathcal{F}_1$ is negative: $u=-\tfrac{4}{3}k$.
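
The three cases just mentioned can be reproduced with exact arithmetic. This is a small sketch assuming units with $k=1$; the function name `compose` is illustrative.

```python
from fractions import Fraction

def compose(v, w, k=Fraction(1)):
    """Composition of collinear speeds in the hypothetical world with K = 1/k^2 > 0."""
    return (v + w) / (1 - v * w / k**2)

k = Fraction(1)

assert compose(k / 2, k / 2) == Fraction(4, 3) * k      # larger than the Galilean sum k
assert compose(2 * k, 2 * k) == Fraction(-4, 3) * k     # two positive speeds, negative result

# v = w = k makes the denominator vanish: the composed speed is infinite.
try:
    compose(k, k)
    diverges = False
except ZeroDivisionError:
    diverges = True
assert diverges
```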

If we use units in which K = k = 1, then the invariant proper time associated with an 
infinitesimal path segment is related to the segment's inertial components via 

$$ds^2 = dt^2 + dx^2 + dy^2 + dz^2.$$

This is the 4-dimensional version of the 3-scalar $dx^2+dy^2+dz^2$, which is invariant under
rotations in space. Hence if K is positive, the transformations between inertial systems are 
rotations in spacetime. I guess you now see why in this hypothetical world the composition 
of two positive speeds can be a negative speed. 


Let us confirm this conclusion by deriving the composition theorem (for k= 1) from the 
assumption that the x' and t' axes are rotated relative to the x and t axes. 



Figure 78

The speed of an object $O$ following the dotted line is $w=\cot(\alpha+\beta)$ relative to $\mathcal{F}'$, the
speed of $\mathcal{F}'$ relative to $\mathcal{F}$ is $v=\tan\alpha$, and the speed of $O$ relative to $\mathcal{F}$ is $u=\cot\beta$. Invoking
the trigonometric relation

$$\tan(\alpha+\beta) = \frac{\tan\alpha+\tan\beta}{1-\tan\alpha\,\tan\beta},$$

we conclude that $\frac{1}{w} = \frac{v+1/u}{1-v/u}$. Solving for $u$, we obtain $u = \frac{v+w}{1-vw}$.
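
The rotation picture can be checked numerically. In the sketch below the two angles are arbitrary sample values (an assumption made for illustration); the speeds are built from them as in the text and then compared with the composition theorem for $k=1$.

```python
import math

alpha, beta = 0.2, 0.5               # hypothetical angles in radians

v = math.tan(alpha)                  # speed of F' relative to F
w = 1 / math.tan(alpha + beta)       # speed of O relative to F'  (cot of alpha + beta)
u = 1 / math.tan(beta)               # speed of O relative to F   (cot of beta)

# Composition theorem for K = k = 1
assert math.isclose(u, (v + w) / (1 - v * w))
```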

How can we rule out the a priori possibility that K > 0? As shown in the body of the book, 
the stability of matter — to be precise, the existence of stable objects that (i) have spatial 
extent (they "occupy" space) and (ii) are composed of a finite number of objects that lack 
spatial extent (they don't "occupy" space) — rests on the existence of relative positions that 
are (a) more or less fuzzy and (b) independent of time. Such relative positions are described 
by probability distributions that are (a) inhomogeneous in space and (b) homogeneous in 
time. Their objective existence thus requires an objective difference between spacetime's
temporal dimension and its spatial dimensions. This rules out the possibility that $K>0$.

How? If $K<0$, and if we use natural units, in which $c=1$, we have that

$$ds^2 = +\,dt^2 - dx^2 - dy^2 - dz^2.$$

As far as physics is concerned, the difference between the positive sign in front of dt and the 
negative signs in front of dx, dy, and dz is the only objective difference between time and 
the spatial dimensions of spacetime. If K were positive, not even this difference would exist. 

7.3.7 The case against zero K 

And what argues against the possibility that K = 0? 

Recall the propagator for a free and stable particle: 



$$\langle B|A\rangle = \int\!\mathcal{DC}\; e^{-ib\,s[\mathcal{C}]}.$$


If $K$ were to vanish, we would have $ds^2=dt^2$. There would be no difference between inertial
time and proper time, and every spacetime path leading from $A$ to $B$ would contribute the
same amplitude $e^{-ib(t_B-t_A)}$ to the propagator $\langle B|A\rangle$, which would be hopelessly divergent as
a result. Worse, $\langle B|A\rangle$ would be independent of the distance between $A$ and $B$. To obtain
well-defined, finite probabilities, cancellations ("destructive interference") must occur, and 
this rules out that K = 0. 

7.3.8 The actual Lorentz transformations 

In the real world, therefore, the Lorentz transformations take the form

$$t' = \frac{t - wx/c^2}{\sqrt{1-w^2/c^2}}, \qquad x' = \frac{x-wt}{\sqrt{1-w^2/c^2}}, \qquad y'=y, \qquad z'=z.$$

Let's explore them diagrammatically, using natural units ($c=1$). Setting $t'=0$, we have
$t=wx$. This tells us that the slope of the $x'$ axis relative to the undashed frame is $w=\tan\alpha$.
Setting $x'=0$, we have $t=x/w$. This tells us that the slope of the $t'$ axis is $1/w$. The
dashed axes are thus rotated by the same angle in opposite directions; if the $t'$ axis is rotated
clockwise relative to the $t$ axis, then the $x'$ axis is rotated counterclockwise relative to the
$x$ axis.
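
A short numerical sketch of the two axis conditions, in natural units. The function name `lorentz` and the sample speed are assumptions made for illustration.

```python
import math

def lorentz(t, x, w):
    """Lorentz transformation along x in natural units (c = 1)."""
    g = 1 / math.sqrt(1 - w**2)
    return g * (t - w * x), g * (x - w * t)

w = 0.5                                   # hypothetical relative speed
alpha = math.atan(w)                      # w = tan(alpha)

# A point with t = w x lies on the x' axis, so its t' vanishes.
tp, _ = lorentz(w * 2.0, 2.0, w)
assert abs(tp) < 1e-12

# A point with t = x / w (that is, x = w t) lies on the t' axis, so its x' vanishes.
_, xp = lorentz(2.0, w * 2.0, w)
assert abs(xp) < 1e-12

# Both dashed axes are tilted by the same angle alpha, toward the line t = x.
print(f"tilt of each dashed axis: {math.degrees(alpha):.1f} degrees")
```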



Figure 79 


We arrive at the same conclusion if we think about the synchronization of clocks in motion. 
Consider three clocks (1, 2, 3) that travel with the same speed $w=\tan\alpha$ relative to $\mathcal{F}$. To
synchronize them, we must send signals from one clock to another. What kind of signals? If
we want our synchronization procedure to be independent of the language we use (that is,
independent of the reference frame), then we must use signals that travel with the invariant
speed $c$.


Here is how it's done: 



Light signals are sent from clock 2 (event $A$) and are reflected by clocks 1 and 3 (events $B$
and $C$, respectively). The distances between the clocks are adjusted so that the reflected
signals arrive simultaneously at clock 2 (event $D$). This ensures that the distance between
clocks 1 and 2 equals the distance between clocks 2 and 3, regardless of the inertial frame
in which they are compared. In $\mathcal{F}'$, where the clocks are at rest, the signals from $A$ have
traveled equal distances when they reach the first and the third clock, respectively. Since they
also have traveled with the same speed $c$, they have traveled for equal times. Therefore the
clocks must be synchronized so that $B$ and $C$ are simultaneous. We may use the worldline
of clock 1 as the $t'$ axis and the straight line through $B$ and $C$ as the $x'$ axis. It is readily
seen that the three angles $\beta$ in the above diagram are equal. From this and the fact that
the slope of the signal from $B$ to $D$ equals 1 (given that $c=1$), the equality of the two angles
$\alpha$ follows.

Simultaneity thus depends on the language (the inertial frame) that we use to describe
a physical situation. If two events $E_1$, $E_2$ are simultaneous in one frame, then there are
frames in which $E_1$ happens after $E_2$ as well as frames in which $E_1$ happens before $E_2$.
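
The frame dependence of simultaneity can be made concrete with two sample events. Everything specific in this sketch (the function names, the events, the speeds $\pm 0.5$) is an illustrative assumption; natural units with $c=1$ are used.

```python
import math

def t_prime(t, x, w):
    """Time coordinate of event (t, x) in a frame moving with velocity w (c = 1)."""
    return (t - w * x) / math.sqrt(1 - w**2)

E1 = (0.0, 0.0)    # (t, x) of the first event
E2 = (0.0, 1.0)    # second event, simultaneous with E1 in the undashed frame

def order(w):
    """Describe the time order of E1 and E2 in a frame moving with velocity w."""
    diff = t_prime(*E1, w) - t_prime(*E2, w)
    if diff > 0:
        return "E1 after E2"
    if diff < 0:
        return "E1 before E2"
    return "simultaneous"

assert order(0.0) == "simultaneous"
assert order(0.5) == "E1 after E2"
assert order(-0.5) == "E1 before E2"
```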

Where do we place the unit points on the space and time axes? The unit point of the time
axis of $\mathcal{F}'$ has the coordinates $t'=1$, $x'=0$ and satisfies $t^2-x^2=1$, as we gather from the
version $(t')^2-(x')^2=t^2-x^2$ of the invariance of $ds^2$. The unit point of the $x'$ axis has the coordinates
$t'=0$, $x'=1$ and satisfies $x^2-t^2=1$. The loci of the unit points of the space and time axes
are the hyperbolas that are defined by these equations:



7.3.9 Lorentz contraction, time dilatation

Imagine a meter stick at rest in $\mathcal{F}'$. At the time $t'=0$, its ends are situated at the points $O$
and $C$. At the time $t=0$, they are situated at the points $O$ and $A$, which are less than a
meter apart. Now imagine a stick (not a meter stick) at rest in $\mathcal{F}$, whose end points at the
time $t'=0$ are $O$ and $C$. In $\mathcal{F}'$ they are a meter apart, but in the stick's rest frame they
are at $O$ and $B$ and thus more than a meter apart. The bottom line: a moving object is
contracted (shortened) in the direction in which it is moving.



Next imagine two clocks, one ($C$) at rest in $\mathcal{F}$ and located at $x=0$, and one ($C'$) at rest in $\mathcal{F}'$
and located at $x'=0$. At $D$, $C'$ indicates that one second has passed, while at $E$ (which
in $\mathcal{F}$ is simultaneous with $D$), $C$ indicates that more than a second has passed. On the other
hand, at $F$ (which in $\mathcal{F}'$ is simultaneous with $D$), $C$ indicates that less than a second has
passed. The bottom line: a moving clock runs slower than a clock at rest.
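
The two bottom lines can be reproduced from the Lorentz transformation. This is a sketch in natural units ($c=1$); the function names and the relative speed $w=0.6$ are assumptions chosen so that the numbers come out round.

```python
import math

def lorentz(t, x, w):
    """Coordinates (t', x') of the event (t, x); c = 1."""
    g = 1 / math.sqrt(1 - w**2)
    return g * (t - w * x), g * (x - w * t)

def inverse(tp, xp, w):
    """Coordinates (t, x) of the event (t', x')."""
    return lorentz(tp, xp, -w)

w = 0.6                                  # hypothetical relative speed
g = 1 / math.sqrt(1 - w**2)              # equals 1.25 for w = 0.6

# Meter stick at rest in F': its far end follows x' = 1.
# At t = 0 this end is found at x = 1/g, so the stick measures 1/g < 1 in F.
length_in_F = 1 / g
_, xp = lorentz(0.0, length_in_F, w)
assert math.isclose(xp, 1.0)

# Event D: the moving clock C' (at x' = 0) shows t' = 1.
t_D, x_D = inverse(1.0, 0.0, w)
reading_of_C_at_E = t_D                  # E lies on x = 0 and is simultaneous with D in F
# Event F lies on x = 0 and has t' = 1, so g * t = 1.
reading_of_C_at_F = 1 / g

assert reading_of_C_at_E > 1 > reading_of_C_at_F
```

Each frame finds the other frame's clock running slow by the same factor, which is consistent only because the two frames disagree about which events are simultaneous.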

Example: Muons ($\mu$ particles) are created near the top of the atmosphere, some ten kilometers
up, when high-energy particles of cosmic origin hit the atmosphere. Since muons decay
spontaneously after an average lifetime of 2.2 microseconds, they don't travel much farther
than 600 meters. Yet many are found at sea level. How do they get that far?

The answer lies in the fact that most of them travel at close to the speed of light. While 
from its own point of view (that is, relative to the inertial system in which it is at rest), a 
muon only lives for about 2 microseconds, from our point of view (that is, relative to an 
inertial system in which it travels close to the speed of light), it lives much longer and has 
enough time to reach the Earth's surface. 
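
The arithmetic behind this example can be sketched as follows. The muon speed of $0.998c$ is an assumed, illustrative value (the text only says "close to the speed of light"); the lifetime of 2.2 microseconds is taken from the text.

```python
import math

C = 299_792_458.0      # speed of light in m/s
TAU = 2.2e-6           # average muon lifetime in its rest frame, in s
BETA = 0.998           # assumed muon speed as a fraction of c (illustrative)

gamma = 1 / math.sqrt(1 - BETA**2)

range_without_dilation = BETA * C * TAU               # in metres
range_with_dilation = gamma * range_without_dilation  # lifetime stretched by gamma

print(f"gamma                 = {gamma:.1f}")
print(f"range, no dilation    = {range_without_dilation:.0f} m")
print(f"range, with dilation  = {range_with_dilation / 1000:.1f} km")
```

With this assumed speed the average range grows from several hundred meters to roughly the thickness of the atmosphere layer mentioned above.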



7.3.10 4-vectors 

3-vectors are triplets of real numbers that transform under rotations like the coordinates
$x,y,z$. 4-vectors are quadruplets of real numbers that transform under Lorentz transformations
like the coordinates of $\vec x=(ct,x,y,z)$.

You will remember that the scalar product of two 3-vectors is invariant under rotations of
the (spatial) coordinate axes; after all, this is why we call it a scalar. Similarly, the scalar
product of two 4-vectors $\vec a=(a_t,\mathbf{a})=(a_0,a_1,a_2,a_3)$ and $\vec b=(b_t,\mathbf{b})=(b_0,b_1,b_2,b_3)$, defined
by

$$(\vec a,\vec b) = a_0b_0 - a_1b_1 - a_2b_2 - a_3b_3,$$

is invariant under Lorentz transformations (as well as translations of the coordinate origin
and rotations of the spatial axes). To demonstrate this, we consider the sum of two 4-vectors
$\vec c=\vec a+\vec b$ and calculate

$$(\vec c,\vec c) = (\vec a+\vec b,\vec a+\vec b) = (\vec a,\vec a)+(\vec b,\vec b)+2(\vec a,\vec b).$$

The products $(\vec a,\vec a)$, $(\vec b,\vec b)$, and $(\vec c,\vec c)$ are invariant 4-scalars. But if they are invariant under
Lorentz transformations, then so is the scalar product $(\vec a,\vec b)$.
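
A numerical illustration of this argument, as a sketch only: the function names, the boost speed, and the component values are assumptions made for illustration, and natural units ($c=1$) are used.

```python
import math

def boost(a, w):
    """Boost the 4-vector a = (a0, a1, a2, a3) along x with speed w (c = 1)."""
    g = 1 / math.sqrt(1 - w**2)
    a0, a1, a2, a3 = a
    return (g * (a0 - w * a1), g * (a1 - w * a0), a2, a3)

def dot(a, b):
    """Scalar product of two 4-vectors: a0 b0 - a1 b1 - a2 b2 - a3 b3."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

a = (2.0, 1.0, 0.5, -1.0)          # hypothetical components
b = (1.0, 3.0, -2.0, 0.5)
c = tuple(p + q for p, q in zip(a, b))

# The identity used in the text: (c,c) = (a,a) + (b,b) + 2 (a,b)
assert math.isclose(dot(c, c), dot(a, a) + dot(b, b) + 2 * dot(a, b))

# The scalar product is unchanged by the boost.
w = 0.6
assert math.isclose(dot(boost(a, w), boost(b, w)), dot(a, b))
```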

One important 4-vector, apart from $\vec x$, is the 4-velocity $\vec u = d\vec x/ds$, which is tangent to the
worldline $\vec x(s)$. $\vec u$ is a 4-vector because $\vec x$ is one and because $ds$ is a scalar (to be precise, a
4-scalar).

The norm or "magnitude" of a 4-vector $\vec a$ is defined as $\sqrt{|(\vec a,\vec a)|}$. It is readily shown that the
norm of $\vec u$ equals $c$ (exercise!).
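
For readers who want to compare their solution, here is one possible route. It uses only $\vec x=(ct,\mathbf{r})$ and the proper-time relation $ds^2 = dt^2 - d\mathbf{r}\cdot d\mathbf{r}/c^2$ from the section on the invariant speed:

```latex
\begin{align*}
(\vec u,\vec u)
  &= \left(c\,\frac{dt}{ds}\right)^{2} - \frac{d\mathbf{r}}{ds}\cdot\frac{d\mathbf{r}}{ds}
   = c^{2}\,\frac{dt^{2} - d\mathbf{r}\cdot d\mathbf{r}/c^{2}}{ds^{2}}
   = c^{2}\,\frac{ds^{2}}{ds^{2}}
   = c^{2},
\end{align*}
```

so that $\sqrt{|(\vec u,\vec u)|}=c$.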

Thus if we use natural units, the 4-velocity is a unit vector.
31 Chapter 9 on page 143 
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9 Licenses 


9.1 GNU GENERAL PUBLIC LICENSE 


Version 3, 29 June 2007 

Copyright © 2007 Free Software Foundation, Inc. 
<http://fsf.org/> 

Everyone is permitted to copy and distribute verba- 
tim copies of this license document, but changing 
it is not allowed. Preamble 

The GNU General Public License is a free, copyleft 
license for software and other kinds of works. 

The licenses for most software and other practi- 
cal works are designed to take away your freedom 
to share and change the works. By contrast, the 
GNU General Public License is intended to guaran- 
tee your freedom to share and change all versions 
of a program— to make sure it remains free software 
for all its users. We, the Free Software Foundation, 
use the GNU General Public License for most of our 
software; it applies also to any other work released 
this way by its authors. You can apply it to your 
programs, too. 

When we speak of free software, we are referring 
to freedom, not price. Our General Public Li- 
censes are designed to make sure that you have 
the freedom to distribute copies of free software 
(and charge for them if you wish), that you receive 
source code or can get it if you want it, that you 
can change the software or use pieces of it in new 
free programs, and that you know you can do these 
things. 

To protect your rights, we need to prevent others 
from denying you these rights or asking you to sur- 
render the rights. Therefore, you have certain re- 
sponsibilities if you distribute copies of the soft- 
ware, or if you modify it: responsibilities to respect 
the freedom of others. 

For example, if you distribute copies of such a pro- 
gram, whether gratis or for a fee, you must pass 
on to the recipients the same freedoms that you re- 
ceived. You must make sure that they, too, receive 
or can get the source code. And you must show 
them these terms so they know their rights. 

Developers that use the GNU GPL protect your 
rights with two steps: (1) assert copyright on the 
software, and (2) offer you this License giving you 
legal permission to copy, distribute and/or modify 
it. 

For the developers’ and authors’ protection, the 
GPL clearly explains that there is no warranty for 
this free software. For both users’ and authors’ 
sake, the GPL requires that modified versions be 
marked as changed, so that their problems will not 
be attributed erroneously to authors of previous 

Some devices are designed to deny users access to 
install or run modified versions of the software in- 
side them, although the manufacturer can do so. 
This is fundamentally incompatible with the aim 
of protecting users’ freedom to change the software. 
The systematic pattern of such abuse occurs in the 
area of products for individuals to use, which is 
precisely where it is most unacceptable. Therefore, 
we have designed this version of the GPL to pro- 
hibit the practice for those products. If such prob- 
lems arise substantially in other domains, we stand 
ready to extend this provision to those domains in 
future versions of the GPL, as needed to protect 
the freedom of users. 

Finally, every program is threatened constantly by 
software patents. States should not allow patents 
to restrict development and use of software on 
general-purpose computers, but in those that do, 
we wish to avoid the special danger that patents 
applied to a free program could make it effectively 
proprietary. To prevent this, the GPL assures that 
patents cannot be used to render the program non- 

The precise terms and conditions for copying, dis- 
tribution and modification follow. TERMS AND 
CONDITIONS 0. Definitions. 

“This License” refers to version 3 of the GNU Gen- 
eral Public License. 

“Copyright” also means copyright-like laws that ap- 
ply to other kinds of works, such as semiconductor 
masks. 

“The Program” refers to any copyrightable work 
licensed under this License. Each licensee is ad- 
dressed as “you”. “Licensees” and “recipients” may 
be individuals or organizations. 

To “modify” a work means to copy from or adapt 
all or part of the work in a fashion requiring copy- 
right permission, other than the making of an exact 
copy. The resulting work is called a “modified ver- 
sion” of the earlier work or a work “based on” the 
earlier work. 

A “covered work” means either the unmodified Pro- 
gram or a work based on the Program. 

To “propagate” a work means to do anything with it 
that, without permission, would make you directly 
or secondarily liable for infringement under appli- 
cable copyright law, except executing it on a com- 
puter or modifying a private copy. Propagation in- 
cludes copying, distribution (with or without mod- 
ification), making available to the public, and in 
some countries other activities as well. 

To “convey” a work means any kind of propagation 
that enables other parties to make or receive copies. 
Mere interaction with a user through a computer 


network, with no transfer of a copy, is not convey- 
ing. 

An interactive user interface displays “Appropriate 
Legal Notices” to the extent that it includes a con- 
venient and prominently visible feature that (1) dis- 
plays an appropriate copyright notice, and (2) tells 
the user that there is no warranty for the work (ex- 
cept to the extent that warranties are provided), 
that licensees may convey the work under this Li- 
cense, and how to view a copy of this License. If 
the interface presents a list of user commands or 
options, such as a menu, a prominent item in the 
list meets this criterion. 1. Source Code. 

The “source code” for a work means the preferred 
form of the work for making modifications to it. 
“Object code” means any non-source form of a 
work. 

A “Standard Interface” means an interface that ei- 
ther is an official standard defined by a recognized 
standards body, or, in the case of interfaces spec- 
ified for a particular programming language, one 
that is widely used among developers working in 
that language. 

The “System Libraries” of an executable work in- 
clude anything, other than the work as a whole, 
that (a) is included in the normal form of packag- 
ing a Major Component, but which is not part of 
that Major Component, and (b) serves only to en- 
able use of the work with that Major Component, 
or to implement a Standard Interface for which an 
implementation is available to the public in source 
code form. A “Major Component”, in this context, 
means a major essential component (kernel, window 
system, and so on) of the specific operating system 
(if any) on which the executable work runs, or a 
compiler used to produce the work, or an object 
code interpreter used to run it. 

The “Corresponding Source” for a work in object 
code form means all the source code needed to gen- 
erate, install, and (for an executable work) run 
the object code and to modify the work, including 
scripts to control those activities. However, it does 
not include the work’s System Libraries, or general- 
purpose tools or generally available free programs 
which are used unmodified in performing those ac- 
tivities but which are not part of the work. For 
example, Corresponding Source includes interface 
definition files associated with source files for the 
work, and the source code for shared libraries and 
dynamically linked subprograms that the work is 
specifically designed to require, such as by intimate 
data communication or control flow between those 
subprograms and other parts of the work. 

The Corresponding Source need not include any- 
thing that users can regenerate automatically from 
other parts of the Corresponding Source. 

The Corresponding Source for a work in source code 
form is that same work. 2. Basic Permissions. 

All rights granted under this License are granted 
for the term of copyright on the Program, and are 
irrevocable provided the stated conditions are met. 
This License explicitly affirms your unlimited per- 
mission to run the unmodified Program. The out- 
put from running a covered work is covered by this 
License only if the output, given its content, con- 
stitutes a covered work. This License acknowledges 
your rights of fair use or other equivalent, as pro- 
vided by copyright law. 

You may make, run and propagate covered works 
that you do not convey, without conditions so long 
as your license otherwise remains in force. You may 
convey covered works to others for the sole purpose 
of having them make modifications exclusively for 
you, or provide you with facilities for running those 
works, provided that you comply with the terms 
of this License in conveying all material for which 
you do not control copyright. Those thus making or 
running the covered works for you must do so exclu- 
sively on your behalf, under your direction and con- 
trol, on terms that prohibit them from making any 
copies of your copyrighted material outside their 
relationship with you. 

Conveying under any other circumstances is permit- 
ted solely under the conditions stated below. Subli- 
censing is not allowed; section 10 makes it unneces- 
sary. 3. Protecting Users’ Legal Rights From Anti- 
Circumvention Law. 

No covered work shall be deemed part of an effec- 
tive technological measure under any applicable law 
fulfilling obligations under article 11 of the WIPO 
copyright treaty adopted on 20 December 1996, or 
similar laws prohibiting or restricting circumven- 
tion of such measures. 

When you convey a covered work, you waive any 
legal power to forbid circumvention of technologi- 
cal measures to the extent such circumvention is ef- 
fected by exercising rights under this License with 
respect to the covered work, and you disclaim any 
intention to limit operation or modification of the 
work as a means of enforcing, against the work’s 
users, your or third parties’ legal rights to forbid 
circumvention of technological measures. 4. Con- 
veying Verbatim Copies. 

You may convey verbatim copies of the Program’s 
source code as you receive it, in any medium, pro- 
vided that you conspicuously and appropriately 
publish on each copy an appropriate copyright no- 
tice; keep intact all notices stating that this License 
and any non-permissive terms added in accord with 
section 7 apply to the code; keep intact all notices 
of the absence of any warranty; and give all recipi- 
ents a copy of this License along with the Program. 


You may charge any price or no price for each copy 
that you convey, and you may offer support or war- 
ranty protection for a fee. 5. Conveying Modified 
Source Versions. 

You may convey a work based on the Program, or 
the modifications to produce it from the Program, 
in the form of source code under the terms of sec- 
tion 4, provided that you also meet all of these con- 
ditions: 

* a) The work must carry prominent notices stating
that you modified it, and giving a relevant date.

* b) The work must carry prominent notices stating
that it is released under this License and any
conditions added under section 7. This requirement
modifies the requirement in section 4 to “keep
intact all notices”.

* c) You must license the entire work, as a whole,
under this License to anyone who comes into
possession of a copy. This License will therefore
apply, along with any applicable section 7
additional terms, to the whole of the work, and all
its parts, regardless of how they are packaged. This
License gives no permission to license the work in
any other way, but it does not invalidate such
permission if you have separately received it.

* d) If the work has interactive user interfaces,
each must display Appropriate Legal Notices;
however, if the Program has interactive interfaces
that do not display Appropriate Legal Notices, your
work need not make them do so.

A compilation of a covered work with other sepa- 
rate and independent works, which are not by their 
nature extensions of the covered work, and which 
are not combined with it such as to form a larger 
program, in or on a volume of a storage or distri- 
bution medium, is called an “aggregate” if the com- 
pilation and its resulting copyright are not used to 
limit the access or legal rights of the compilation’s 
users beyond what the individual works permit. In- 
clusion of a covered work in an aggregate does not 
cause this License to apply to the other parts of the 
aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form 
under the terms of sections 4 and 5, provided that 
you also convey the machine-readable Correspond- 
ing Source under the terms of this License, in one 
of these ways: 

* a) Convey the object code in, or embodied in,
a physical product (including a physical
distribution medium), accompanied by the
Corresponding Source fixed on a durable physical
medium customarily used for software interchange.

* b) Convey the object code in, or embodied in,
a physical product (including a physical
distribution medium), accompanied by a written
offer, valid for at least three years and valid for
as long as you offer spare parts or customer support
for that product model, to give anyone who possesses
the object code either (1) a copy of the
Corresponding Source for all the software in the
product that is covered by this License, on a
durable physical medium customarily used for
software interchange, for a price no more than your
reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no
charge.

* c) Convey individual copies of the object code
with a copy of the written offer to provide the
Corresponding Source. This alternative is allowed
only occasionally and noncommercially, and only if
you received the object code with such an offer, in
accord with subsection 6b.

* d) Convey the object code by offering access from
a designated place (gratis or for a charge), and
offer equivalent access to the Corresponding Source
in the same way through the same place at no further
charge. You need not require recipients to copy the
Corresponding Source along with the object code. If
the place to copy the object code is a network
server, the Corresponding Source may be on a
different server (operated by you or a third party)
that supports equivalent copying facilities,
provided you maintain clear directions next to the
object code saying where to find the Corresponding
Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure
that it is available for as long as needed to
satisfy these requirements.

* e) Convey the object code using peer-to-peer
transmission, provided you inform other peers where
the object code and Corresponding Source of the work
are being offered to the general public at no charge
under subsection 6d.

A separable portion of the object code, whose 
source code is excluded from the Corresponding 
Source as a System Library, need not be included 
in conveying the object code work. 

A “User Product” is either (1) a “consumer prod- 
uct”, which means any tangible personal property 
which is normally used for personal, family, or 
household purposes, or (2) anything designed or 
sold for incorporation into a dwelling. In deter- 
mining whether a product is a consumer product, 
doubtful cases shall be resolved in favor of cover- 
age. For a particular product received by a par- 
ticular user, “normally used” refers to a typical or 
common use of that class of product, regardless of 
the status of the particular user or of the way in 
which the particular user actually uses, or expects 
or is expected to use, the product. A product is a 
consumer product regardless of whether the prod- 
uct has substantial commercial, industrial or non- 
consumer uses, unless such uses represent the only 
significant mode of use of the product. 

“Installation Information” for a User Product 
means any methods, procedures, authorization 
keys, or other information required to install and 
execute modified versions of a covered work in that 
User Product from a modified version of its Corre- 
sponding Source. The information must suffice to 
ensure that the continued functioning of the modi- 
fied object code is in no case prevented or interfered 
with solely because modification has been made. 


If you convey an object code work under this sec- 
tion in, or with, or specifically for use in, a User 
Product, and the conveying occurs as part of a 
transaction in which the right of possession and 
use of the User Product is transferred to the re- 
cipient in perpetuity or for a fixed term (regard- 
less of how the transaction is characterized), the 
Corresponding Source conveyed under this section 
must be accompanied by the Installation Informa- 
tion. But this requirement does not apply if neither 
you nor any third party retains the ability to install 
modified object code on the User Product (for ex- 
ample, the work has been installed in ROM). 

The requirement to provide Installation Informa- 
tion does not include a requirement to continue to 
provide support service, warranty, or updates for a 
work that has been modified or installed by the re- 
cipient, or for the User Product in which it has been 
modified or installed. Access to a network may be 
denied when the modification itself materially and 
adversely affects the operation of the network or 
violates the rules and protocols for communication 
across the network. 

Corresponding Source conveyed, and Installation 
Information provided, in accord with this section 
must be in a format that is publicly documented 
(and with an implementation available to the public 
in source code form), and must require no special 
password or key for unpacking, reading or copying.

7. Additional Terms.

“Additional permissions” are terms that supplement 
the terms of this License by making exceptions from 
one or more of its conditions. Additional permis- 
sions that are applicable to the entire Program 
shall be treated as though they were included in 
this License, to the extent that they are valid un- 
der applicable law. If additional permissions apply 
only to part of the Program, that part may be used 
separately under those permissions, but the entire 
Program remains governed by this License without 
regard to the additional permissions. 

When you convey a copy of a covered work, you may 
at your option remove any additional permissions 
from that copy, or from any part of it. (Additional 
permissions may be written to require their own re- 
moval in certain cases when you modify the work.) 
You may place additional permissions on material, 
added by you to a covered work, for which you have 
or can give appropriate copyright permission. 

Notwithstanding any other provision of this Li- 
cense, for material you add to a covered work, you 
may (if authorized by the copyright holders of that 
material) supplement the terms of this License with 
terms: 

* a) Disclaiming warranty or limiting liability
differently from the terms of sections 15 and 16 of
this License; or

* b) Requiring preservation of specified reasonable
legal notices or author attributions in that
material or in the Appropriate Legal Notices
displayed by works containing it; or

* c) Prohibiting misrepresentation of the origin of
that material, or requiring that modified versions
of such material be marked in reasonable ways as
different from the original version; or

* d) Limiting the use for publicity purposes of
names of licensors or authors of the material; or

* e) Declining to grant rights under trademark law
for use of some trade names, trademarks, or service
marks; or

* f) Requiring indemnification of licensors and
authors of that material by anyone who conveys the
material (or modified versions of it) with
contractual assumptions of liability to the
recipient, for any liability that these contractual
assumptions directly impose on those licensors and
authors.

All other non-permissive additional terms are con- 
sidered “further restrictions” within the meaning of 
section 10. If the Program as you received it, or any 
part of it, contains a notice stating that it is gov- 
erned by this License along with a term that is a 
further restriction, you may remove that term. If a 
license document contains a further restriction but 
permits relicensing or conveying under this License, 
you may add to a covered work material governed 
by the terms of that license document, provided 
that the further restriction does not survive such 
relicensing or conveying. 

If you add terms to a covered work in accord with 
this section, you must place, in the relevant source 
files, a statement of the additional terms that ap- 
ply to those files, or a notice indicating where to 
find the applicable terms. 

Additional terms, permissive or non-permissive, 
may be stated in the form of a separately written 
license, or stated as exceptions; the above requirements
apply either way.

8. Termination.

You may not propagate or modify a covered work 
except as expressly provided under this License. 
Any attempt otherwise to propagate or modify it is 
void, and will automatically terminate your rights 
under this License (including any patent licenses 
granted under the third paragraph of section 11). 

However, if you cease all violation of this License, 
then your license from a particular copyright holder 
is reinstated (a) provisionally, unless and until the 
copyright holder explicitly and finally terminates 
your license, and (b) permanently, if the copyright 
holder fails to notify you of the violation by some 
reasonable means prior to 60 days after the cessation.


Moreover, your license from a particular copyright 
holder is reinstated permanently if the copyright 
holder notifies you of the violation by some reason- 
able means, this is the first time you have received 
notice of violation of this License (for any work)
from that copyright holder, and you cure the violation
prior to 30 days after your receipt of the
notice. 

Termination of your rights under this section does 
not terminate the licenses of parties who have re- 
ceived copies or rights from you under this License. 
If your rights have been terminated and not perma- 
nently reinstated, you do not qualify to receive new 
licenses for the same material under section 10.

9. Acceptance Not Required for Having Copies.

You are not required to accept this License in or- 
der to receive or run a copy of the Program. Ancil- 
lary propagation of a covered work occurring solely 
as a consequence of using peer-to-peer transmission 
to receive a copy likewise does not require accep- 
tance. However, nothing other than this License 
grants you permission to propagate or modify any 
covered work. These actions infringe copyright if 
you do not accept this License. Therefore, by mod- 
ifying or propagating a covered work, you indicate 
your acceptance of this License to do so.

10. Automatic Licensing of Downstream Recipients.

Each time you convey a covered work, the recipient 
automatically receives a license from the original 
licensors, to run, modify and propagate that work, 
subject to this License. You are not responsible 
for enforcing compliance by third parties with this 
License. 

An “entity transaction” is a transaction transfer- 
ring control of an organization, or substantially all 
assets of one, or subdividing an organization, or 
merging organizations. If propagation of a cov- 
ered work results from an entity transaction, each 
party to that transaction who receives a copy of the 
work also receives whatever licenses to the work the 
party’s predecessor in interest had or could give un- 
der the previous paragraph, plus a right to posses- 
sion of the Corresponding Source of the work from 
the predecessor in interest, if the predecessor has it 
or can get it with reasonable efforts. 

You may not impose any further restrictions on the 
exercise of the rights granted or affirmed under this 
License. For example, you may not impose a license 
fee, royalty, or other charge for exercise of rights 
granted under this License, and you may not ini- 
tiate litigation (including a cross-claim or counter- 
claim in a lawsuit) alleging that any patent claim 
is infringed by making, using, selling, offering for 
sale, or importing the Program or any portion of it.

11. Patents.

A “contributor” is a copyright holder who autho- 
rizes use under this License of the Program or a 
work on which the Program is based. The work 
thus licensed is called the contributor’s “contribu- 
tor version”. 

A contributor’s “essential patent claims” are all 
patent claims owned or controlled by the contribu- 
tor, whether already acquired or hereafter acquired, 
that would be infringed by some manner, permit- 
ted by this License, of making, using, or selling its 
contributor version, but do not include claims that 
would be infringed only as a consequence of further 
modification of the contributor version. For pur- 
poses of this definition, “control” includes the right 
to grant patent sublicenses in a manner consistent 
with the requirements of this License. 

Each contributor grants you a non-exclusive, world- 
wide, royalty-free patent license under the contrib- 
utor’s essential patent claims, to make, use, sell, of- 
fer for sale, import and otherwise run, modify and 
propagate the contents of its contributor version. 


In the following three paragraphs, a “patent li- 
cense” is any express agreement or commitment, 
however denominated, not to enforce a patent (such 
as an express permission to practice a patent or 
covenant not to sue for patent infringement). To 
“grant” such a patent license to a party means to 
make such an agreement or commitment not to en- 
force a patent against the party. 


If you convey a covered work, knowingly relying 
on a patent license, and the Corresponding Source 
of the work is not available for anyone to copy, 
free of charge and under the terms of this License, 
through a publicly available network server or other 
readily accessible means, then you must either (1) 
cause the Corresponding Source to be so available, 
or (2) arrange to deprive yourself of the benefit 
of the patent license for this particular work, or 
(3) arrange, in a manner consistent with the re- 
quirements of this License, to extend the patent 
license to downstream recipients. “Knowingly re- 
lying” means you have actual knowledge that, but 
for the patent license, your conveying the covered 
work in a country, or your recipient’s use of the cov- 
ered work in a country, would infringe one or more 
identifiable patents in that country that you have 
reason to believe are valid. 


If, pursuant to or in connection with a single trans- 
action or arrangement, you convey, or propagate 
by procuring conveyance of, a covered work, and 
grant a patent license to some of the parties re- 
ceiving the covered work authorizing them to use, 
propagate, modify or convey a specific copy of the 
covered work, then the patent license you grant is 
automatically extended to all recipients of the cov- 
ered work and works based on it. 


A patent license is “discriminatory” if it does not in- 
clude within the scope of its coverage, prohibits the 
exercise of, or is conditioned on the non-exercise 
of one or more of the rights that are specifically 
granted under this License. You may not convey a 
covered work if you are a party to an arrangement 
with a third party that is in the business of dis- 
tributing software, under which you make payment 
to the third party based on the extent of your ac- 
tivity of conveying the work, and under which the 
third party grants, to any of the parties who would 
receive the covered work from you, a discrimina- 
tory patent license (a) in connection with copies 
of the covered work conveyed by you (or copies 
made from those copies), or (b) primarily for and in 
connection with specific products or compilations 
that contain the covered work, unless you entered 
into that arrangement, or that patent license was 
granted, prior to 28 March 2007. 


Nothing in this License shall be construed as excluding
or limiting any implied license or other defenses to
infringement that may otherwise be available to you
under applicable patent law.

12. No Surrender of Others’ Freedom.


If conditions are imposed on you (whether by court 
order, agreement or otherwise) that contradict the 
conditions of this License, they do not excuse you 
from the conditions of this License. If you cannot 
convey a covered work so as to satisfy simultane- 
ously your obligations under this License and any 
other pertinent obligations, then as a consequence 
you may not convey it at all. For example, if you 
agree to terms that obligate you to collect a roy- 
alty for further conveying from those to whom you 
convey the Program, the only way you could satisfy 
both those terms and this License would be to refrain
entirely from conveying the Program.

13. Use with the GNU Affero General Public License.


Notwithstanding any other provision of this Li- 
cense, you have permission to link or combine any 
covered work with a work licensed under version 
3 of the GNU Affero General Public License into 
a single combined work, and to convey the result- 
ing work. The terms of this License will continue 
to apply to the part which is the covered work, but 
the special requirements of the GNU Affero General 
Public License, section 13, concerning interaction 
through a network will apply to the combination 
as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised 
and/or new versions of the GNU General Public Li- 
cense from time to time. Such new versions will be 
similar in spirit to the present version, but may dif- 
fer in detail to address new problems or concerns. 

Each version is given a distinguishing version num- 
ber. If the Program specifies that a certain num- 
bered version of the GNU General Public License 
“or any later version” applies to it, you have the 
option of following the terms and conditions either 
of that numbered version or of any later version 
published by the Free Software Foundation. If the 
Program does not specify a version number of the 
GNU General Public License, you may choose any 
version ever published by the Free Software Foundation.

If the Program specifies that a proxy can decide 
which future versions of the GNU General Public 
License can be used, that proxy’s public statement 
of acceptance of a version permanently authorizes 
you to choose that version for the Program. 

Later license versions may give you additional or 
different permissions. However, no additional obli- 
gations are imposed on any author or copyright 
holder as a result of your choosing to follow a later 
version.

15. Disclaimer of Warranty.

THERE IS NO WARRANTY FOR THE PRO- 
GRAM, TO THE EXTENT PERMITTED BY AP- 
PLICABLE LAW. EXCEPT WHEN OTHERWISE 
STATED IN WRITING THE COPYRIGHT HOLD- 
ERS AND/OR OTHER PARTIES PROVIDE THE 
PROGRAM “AS IS” WITHOUT WARRANTY OF 
ANY KIND, EITHER EXPRESSED OR IMPLIED, 
INCLUDING, BUT NOT LIMITED TO, THE IM- 
PLIED WARRANTIES OF MERCHANTABILITY 
AND FITNESS FOR A PARTICULAR PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND 
PERFORMANCE OF THE PROGRAM IS WITH 
YOU. SHOULD THE PROGRAM PROVE DEFECTIVE,
YOU ASSUME THE COST OF ALL NECESSARY
SERVICING, REPAIR OR CORRECTION.

16. Limitation of Liability.


IN NO EVENT UNLESS REQUIRED BY APPLI- 
CABLE LAW OR AGREED TO IN WRITING 
WILL ANY COPYRIGHT HOLDER, OR ANY 
OTHER PARTY WHO MODIFIES AND/OR CON- 
VEYS THE PROGRAM AS PERMITTED ABOVE, 
BE LIABLE TO YOU FOR DAMAGES, IN- 
CLUDING ANY GENERAL, SPECIAL, INCIDEN- 
TAL OR CONSEQUENTIAL DAMAGES ARISING 
OUT OF THE USE OR INABILITY TO USE 
THE PROGRAM (INCLUDING BUT NOT LIM- 
ITED TO LOSS OF DATA OR DATA BEING REN- 
DERED INACCURATE OR LOSSES SUSTAINED 
BY YOU OR THIRD PARTIES OR A FAILURE 
OF THE PROGRAM TO OPERATE WITH ANY 
OTHER PROGRAMS), EVEN IF SUCH HOLDER 
OR OTHER PARTY HAS BEEN ADVISED OF 
THE POSSIBILITY OF SUCH DAMAGES.

17. Interpretation of Sections 15 and 16.

If the disclaimer of warranty and limitation of liability
provided above cannot be given local legal effect
according to their terms, reviewing courts shall
apply local law that most closely approximates an 
absolute waiver of all civil liability in connection 
with the Program, unless a warranty or assumption 
of liability accompanies a copy of the Program in 
return for a fee. 

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to 
be of the greatest possible use to the public, the 
best way to achieve this is to make it free software 
which everyone can redistribute and change under 
these terms. 

To do so, attach the following notices to the pro- 
gram. It is safest to attach them to the start of 
each source file to most effectively state the exclu- 
sion of warranty; and each file should have at least 
the “copyright” line and a pointer to where the full 
notice is found. 

<one line to give the program’s name and a brief
idea of what it does.>
Copyright (C) <year> <name of author>

This program is free software: you can redistribute 
it and/or modify it under the terms of the GNU 
General Public License as published by the Free 
Software Foundation, either version 3 of the Li- 
cense, or (at your option) any later version. 

This program is distributed in the hope that 
it will be useful, but WITHOUT ANY WAR- 
RANTY; without even the implied warranty of 
MERCHANTABILITY or FITNESS FOR A PAR- 
TICULAR PURPOSE. See the GNU General Public 
License for more details. 

You should have received a copy of the GNU Gen- 
eral Public License along with this program. If not, 
see <http://www.gnu.org/licenses/>.

Also add information on how to contact you by elec- 
tronic and paper mail. 

If the program does terminal interaction, make it 
output a short notice like this when it starts in an 
interactive mode: 

<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY;
for details type ‘show w’.
This is free software, and you are welcome to
redistribute it under certain conditions; type
‘show c’ for details.

The hypothetical commands ‘show w’ and ‘show c’ 
should show the appropriate parts of the General 
Public License. Of course, your program’s com- 
mands might be different; for a GUI interface, you 
would use an “about box”. 

You should also get your employer (if you work 
as a programmer) or school, if any, to sign a 
“copyright disclaimer” for the program, if nec- 
essary. For more information on this, and 
how to apply and follow the GNU GPL, see 
<http://www.gnu.org/licenses/>. 

The GNU General Public License does not permit 
incorporating your program into proprietary pro- 
grams. If your program is a subroutine library, you 
may consider it more useful to permit linking pro- 
prietary applications with the library. If this is 
what you want to do, use the GNU Lesser General 
Public License instead of this License. But first, 
please read
<http://www.gnu.org/philosophy/why-not-lgpl.html>.


9.2 GNU Free Documentation License 


Version 1.3, 3 November 2008 

Copyright © 2000, 2001, 2002, 2007, 2008 Free Soft- 
ware Foundation, Inc. <http://fsf.org/> 

Everyone is permitted to copy and distribute verbatim
copies of this license document, but changing
it is not allowed.

0. PREAMBLE

The purpose of this License is to make a manual, 
textbook, or other functional and useful document 
"free" in the sense of freedom: to assure everyone 
the effective freedom to copy and redistribute it, 
with or without modifying it, either commercially 
or noncommercially. Secondarily, this License pre- 
serves for the author and publisher a way to get 
credit for their work, while not being considered 
responsible for modifications made by others. 

This License is a kind of "copyleft", which means 
that derivative works of the document must them- 
selves be free in the same sense. It complements 
the GNU General Public License, which is a copy- 
left license designed for free software. 

We have designed this License in order to use it 
for manuals for free software, because free software 
needs free documentation: a free program should 
come with manuals providing the same freedoms 
that the software does. But this License is not lim- 
ited to software manuals; it can be used for any tex- 
tual work, regardless of subject matter or whether 
it is published as a printed book. We recommend 
this License principally for works whose purpose is 
instruction or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, 
in any medium, that contains a notice placed by the 
copyright holder saying it can be distributed under 
the terms of this License. Such a notice grants a 
world-wide, royalty-free license, unlimited in dura- 
tion, to use that work under the conditions stated 
herein. The "Document", below, refers to any such 
manual or work. Any member of the public is a li- 
censee, and is addressed as "you". You accept the 
license if you copy, modify or distribute the work 
in a way requiring permission under copyright law. 

A "Modified Version" of the Document means any 
work containing the Document or a portion of it, ei- 
ther copied verbatim, or with modifications and/or 
translated into another language. 

A "Secondary Section" is a named appendix or a 
front-matter section of the Document that deals exclusively
with the relationship of the publishers or
authors of the Document to the Document’s overall
subject (or to related matters) and contains noth- 
ing that could fall directly within that overall sub- 
ject. (Thus, if the Document is in part a textbook 
of mathematics, a Secondary Section may not ex- 
plain any mathematics.) The relationship could be 
a matter of historical connection with the subject 
or with related matters, or of legal, commercial, 
philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sec- 
tions whose titles are designated, as being those of 
Invariant Sections, in the notice that says that the 
Document is released under this License. If a sec- 
tion does not fit the above definition of Secondary 
then it is not allowed to be designated as Invariant. 
The Document may contain zero Invariant Sections. 
If the Document does not identify any Invariant 
Sections then there are none. 

The "Cover Texts" are certain short passages of text 
that are listed, as Front-Cover Texts or Back-Cover 
Texts, in the notice that says that the Document is 
released under this License. A Front-Cover Text 
may be at most 5 words, and a Back-Cover Text 
may be at most 25 words. 

A "Transparent" copy of the Document means a 
machine-readable copy, represented in a format 
whose specification is available to the general pub- 
lic, that is suitable for revising the document 
straightforwardly with generic text editors or (for 
images composed of pixels) generic paint programs 
or (for drawings) some widely available drawing ed- 
itor, and that is suitable for input to text format- 
ters or for automatic translation to a variety of for- 
mats suitable for input to text formatters. A copy 
made in an otherwise Transparent file format whose 
markup, or absence of markup, has been arranged 
to thwart or discourage subsequent modification by 
readers is not Transparent. An image format is not 
Transparent if used for any substantial amount of 
text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent 
copies include plain ASCII without markup, Tex- 
info input format, LaTeX input format, SGML or 
XML using a publicly available DTD, and standard- 
conforming simple HTML, PostScript or PDF de- 
signed for human modification. Examples of trans- 
parent image formats include PNG, XCF and JPG. 
Opaque formats include proprietary formats that 
can be read and edited only by proprietary word 
processors, SGML or XML for which the DTD 
and/or processing tools are not generally available, 
and the machine-generated HTML, PostScript or
PDF produced by some word processors for output
purposes only. 

The "Title Page" means, for a printed book, the 
title page itself, plus such following pages as are 
needed to hold, legibly, the material this License 
requires to appear in the title page. For works in 
formats which do not have any title page as such, 
"Title Page" means the text near the most promi- 
nent appearance of the work’s title, preceding the 
beginning of the body of the text. 

The "publisher" means any person or entity that 
distributes copies of the Document to the public. 

A section "Entitled XYZ" means a named subunit 
of the Document whose title either is precisely XYZ 
or contains XYZ in parentheses following text that 
translates XYZ in another language. (Here XYZ 
stands for a specific section name mentioned below, 
such as "Acknowledgements", "Dedications", "En- 
dorsements", or "History".) To "Preserve the Title" 
of such a section when you modify the Document 
means that it remains a section "Entitled XYZ" ac- 
cording to this definition. 

The Document may include Warranty Disclaimers 
next to the notice which states that this License 
applies to the Document. These Warranty Dis- 
claimers are considered to be included by reference 
in this License, but only as regards disclaiming war- 
ranties: any other implication that these Warranty 
Disclaimers may have is void and has no effect on 
the meaning of this License.

2. VERBATIM COPYING

You may copy and distribute the Document in any 
medium, either commercially or noncommercially, 
provided that this License, the copyright notices, 
and the license notice saying this License applies to 
the Document are reproduced in all copies, and that 
you add no other conditions whatsoever to those 
of this License. You may not use technical mea- 
sures to obstruct or control the reading or further 
copying of the copies you make or distribute. How- 
ever, you may accept compensation in exchange for 
copies. If you distribute a large enough number of 
copies you must also follow the conditions in section 3.

You may also lend copies, under the same condi- 
tions stated above, and you may publicly display 
copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media
that commonly have printed covers) of the Document,
numbering more than 100, and the Document’s
license notice requires Cover Texts, you
must enclose the copies in covers that carry, clearly
and legibly, all these Cover Texts: Front-Cover
Texts on the front cover, and Back-Cover Texts
on the back cover. Both covers must also clearly 
and legibly identify you as the publisher of these 
copies. The front cover must present the full title 
with all words of the title equally prominent and 
visible. You may add other material on the covers 
in addition. Copying with changes limited to the 
covers, as long as they preserve the title of the Doc- 
ument and satisfy these conditions, can be treated 
as verbatim copying in other respects. 

If the required texts for either cover are too volu- 
minous to fit legibly, you should put the first ones 
listed (as many as fit reasonably) on the actual 
cover, and continue the rest onto adjacent pages. 

If you publish or distribute Opaque copies of the 
Document numbering more than 100, you must ei- 
ther include a machine-readable Transparent copy 
along with each Opaque copy, or state in or with 
each Opaque copy a computer-network location 
from which the general network-using public has 
access to download using public-standard network 
protocols a complete Transparent copy of the Doc- 
ument, free of added material. If you use the lat- 
ter option, you must take reasonably prudent steps, 
when you begin distribution of Opaque copies in 
quantity, to ensure that this Transparent copy will 
remain thus accessible at the stated location until 
at least one year after the last time you distribute 
an Opaque copy (directly or through your agents or 
retailers) of that edition to the public. 

It is requested, but not required, that you con- 
tact the authors of the Document well before redis- 
tributing any large number of copies, to give them 
a chance to provide you with an updated version of 
the Document.

4. MODIFICATIONS

You may copy and distribute a Modified Version of 
the Document under the conditions of sections 2 
and 3 above, provided that you release the Modi- 
fied Version under precisely this License, with the 
Modified Version filling the role of the Document, 
thus licensing distribution and modification of the 
Modified Version to whoever possesses a copy of it. 
In addition, you must do these things in the Modi- 
fied Version: 

* A. Use in the Title Page (and on the covers, if
any) a title distinct from that of the Document,
and from those of previous versions (which should,
if there were any, be listed in the History section
of the Document). You may use the same title as
a previous version if the original publisher of that
version gives permission.

* B. List on the Title Page, as authors, one or more
persons or entities responsible for authorship of
the modifications in the Modified Version, together
with at least five of the principal authors of the
Document (all of its principal authors, if it has
fewer than five), unless they release you from this
requirement.

* C. State on the Title page the name of the
publisher of the Modified Version, as the publisher.

* D. Preserve all the copyright notices of the
Document.

* E. Add an appropriate copyright notice for your
modifications adjacent to the other copyright
notices.

* F. Include, immediately after the copyright
notices, a license notice giving the public
permission to use the Modified Version under the
terms of this License, in the form shown in the
Addendum below.

* G. Preserve in that license notice the full lists
of Invariant Sections and required Cover Texts
given in the Document’s license notice.

* H. Include an unaltered copy of this License.

* I. Preserve the section Entitled "History",
Preserve its Title, and add to it an item stating at
least the title, year, new authors, and publisher of
the Modified Version as given on the Title Page. If
there is no section Entitled "History" in the
Document, create one stating the title, year,
authors, and publisher of the Document as given on
its Title Page, then add an item describing the
Modified Version as stated in the previous sentence.

* J. Preserve the network location, if any, given in
the Document for public access to a Transparent
copy of the Document, and likewise the network
locations given in the Document for previous
versions it was based on. These may be placed in
the "History" section. You may omit a network
location for a work that was published at least four
years before the Document itself, or if the original
publisher of the version it refers to gives
permission.

* K. For any section Entitled "Acknowledgements"
or "Dedications", Preserve the Title of the section,
and preserve in the section all the substance and
tone of each of the contributor acknowledgements
and/or dedications given therein.

* L. Preserve all the Invariant Sections of the
Document, unaltered in their text and in their
titles. Section numbers or the equivalent are not
considered part of the section titles.

* M. Delete any section Entitled "Endorsements".
Such a section may not be included in the Modified
Version.

* N. Do not retitle any existing section to be
Entitled "Endorsements" or to conflict in title with
any Invariant Section.

* O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter 
sections or appendices that qualify as Secondary 
Sections and contain no material copied from the 
Document, you may at your option designate some 
or all of these sections as invariant. To do this, add 
their titles to the list of Invariant Sections in the 
Modified Version’s license notice. These titles must 
be distinct from any other section titles. 

You may add a section Entitled "Endorsements", 
provided it contains nothing but endorsements of 
your Modified Version by various parties — for ex- 
ample, statements of peer review or that the text 
has been approved by an organization as the au- 
thoritative definition of a standard. 

You may add a passage of up to five words as a 
Front-Cover Text, and a passage of up to 25 words 
as a Back-Cover Text, to the end of the list of Cover 
Texts in the Modified Version. Only one passage of 
Front-Cover Text and one of Back-Cover Text may 
be added by (or through arrangements made by) 
any one entity. If the Document already includes 
a cover text for the same cover, previously added 
by you or by arrangement made by the same entity 
you are acting on behalf of, you may not add another;
but you may replace the old one, on explicit
permission from the previous publisher that added 
the old one. 

The author(s) and publisher(s) of the Document do 
not by this License give permission to use their 
names for publicity for or to assert or imply
endorsement of any Modified Version.

5. COMBINING DOCUMENTS

You may combine the Document with other docu- 
ments released under this License, under the terms 
defined in section 4 above for modified versions, 
provided that you include in the combination all 
of the Invariant Sections of all of the original doc- 
uments, unmodified, and list them all as Invariant 
Sections of your combined work in its license no- 
tice, and that you preserve all their Warranty Dis- 
claimers. 

The combined work need only contain one copy of 
this License, and multiple identical Invariant Sec- 
tions may be replaced with a single copy. If there 
are multiple Invariant Sections with the same name 
but different contents, make the title of each such 
section unique by adding at the end of it, in paren- 
theses, the name of the original author or publisher 
of that section if known, or else a unique number. 
Make the same adjustment to the section titles in 
the list of Invariant Sections in the license notice 
of the combined work. 

In the combination, you must combine any sections 
Entitled "History" in the various original docu- 
ments, forming one section Entitled "History"; like- 
wise combine any sections Entitled "Acknowledge- 
ments", and any sections Entitled "Dedications". 
You must delete all sections Entitled "Endorsements".

6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Docu- 
ment and other documents released under this Li- 
cense, and replace the individual copies of this Li- 
cense in the various documents with a single copy 
that is included in the collection, provided that you 
follow the rules of this License for verbatim copying 
of each of the documents in all other respects. 

You may extract a single document from such a col- 
lection, and distribute it individually under this Li- 
cense, provided you insert a copy of this License 
into the extracted document, and follow this
License in all other respects regarding verbatim
copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives 
with other separate and independent documents or 
works, in or on a volume of a storage or distribution 
medium, is called an "aggregate" if the copyright re- 
sulting from the compilation is not used to limit the 
legal rights of the compilation’s users beyond what 
the individual works permit. When the Document 
is included in an aggregate, this License does not 
apply to the other works in the aggregate which are 
not themselves derivative works of the Document. 

If the Cover Text requirement of section 3 is appli- 
cable to these copies of the Document, then if the 
Document is less than one half of the entire aggre- 
gate, the Document’s Cover Texts may be placed 
on covers that bracket the Document within the 
aggregate, or the electronic equivalent of covers 
if the Document is in electronic form. Otherwise 
they must appear on printed covers that bracket 
the whole aggregate.

8. TRANSLATION


Translation is considered a kind of modification, so 
you may distribute translations of the Document 
under the terms of section 4. Replacing Invariant 
Sections with translations requires special permis- 
sion from their copyright holders, but you may in- 
clude translations of some or all Invariant Sections 
in addition to the original versions of these Invari- 
ant Sections. You may include a translation of this 
License, and all the license notices in the Document, 
and any Warranty Disclaimers, provided that you 
also include the original English version of this Li- 
cense and the original versions of those notices and 
disclaimers. In case of a disagreement between the 
translation and the original version of this License 
or a notice or disclaimer, the original version will 
prevail. 

If a section in the Document is Entitled "Acknowl- 
edgements", "Dedications", or "History", the re- 
quirement (section 4) to Preserve its Title (section 
1) will typically require changing the actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute 
the Document except as expressly provided under 
this License. Any attempt otherwise to copy, mod- 
ify, sublicense, or distribute it is void, and will 
automatically terminate your rights under this Li- 
cense. 

However, if you cease all violation of this License, 
then your license from a particular copyright holder 
is reinstated (a) provisionally, unless and until the 
copyright holder explicitly and finally terminates 
your license, and (b) permanently, if the copyright 
holder fails to notify you of the violation by some 
reasonable means prior to 60 days after the cessation.

Moreover, your license from a particular copyright 
holder is reinstated permanently if the copyright 
holder notifies you of the violation by some reason- 
able means, this is the first time you have received 
notice of violation of this License (for any work) 
from that copyright holder, and you cure the vi- 
olation prior to 30 days after your receipt of the 
notice. 

Termination of your rights under this section does 
not terminate the licenses of parties who have re- 
ceived copies or rights from you under this License. 
If your rights have been terminated and not perma- 
nently reinstated, receipt of a copy of some or all 
of the same material does not give you any rights 
to use it.

10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, re- 
vised versions of the GNU Free Documentation Li- 
cense from time to time. Such new versions will be 
similar in spirit to the present version, but may dif- 
fer in detail to address new problems or concerns. 
See http://www.gnu.org/copyleft/. 

Each version of the License is given a distinguish- 
ing version number. If the Document specifies that 
a particular numbered version of this License "or 
any later version" applies to it, you have the op- 
tion of following the terms and conditions either of 
that specified version or of any later version that 
has been published (not as a draft) by the Free Soft- 
ware Foundation. If the Document does not specify 
a version number of this License, you may choose 
any version ever published (not as a draft) by the 
Free Software Foundation. If the Document specifies
that a proxy can decide which future versions of
this License can be used, that proxy’s public statement
of acceptance of a version permanently authorizes
you to choose that version for the Document.

11. RELICENSING

"Massive Multiauthor Collaboration Site" (or 
"MMC Site") means any World Wide Web server 
that publishes copyrightable works and also pro- 
vides prominent facilities for anybody to edit those 
works. A public wiki that anybody can edit is 
an example of such a server. A "Massive Multiau- 
thor Collaboration" (or "MMC") contained in the 
site means any set of copyrightable works thus pub- 
lished on the MMC site. 

"CC-BY-SA" means the Creative Commons 
Attribution-Share Alike 3.0 license published by 
Creative Commons Corporation, a not-for-profit 
corporation with a principal place of business in 
San Francisco, California, as well as future copyleft 
versions of that license published by that same 
organization. 

"Incorporate" means to publish or republish a Doc- 
ument, in whole or in part, as part of another Doc- 
ument. 

An MMC is "eligible for relicensing" if it is licensed 
under this License, and if all works that were first 
published under this License somewhere other than 
this MMC, and subsequently incorporated in whole 
or in part into the MMC, (1) had no cover texts or 
invariant sections, and (2) were thus incorporated 
prior to November 1, 2008. 

The operator of an MMC Site may republish an 
MMC contained in the site under CC-BY-SA on the 
same site at any time before August 1, 2009, provided
the MMC is eligible for relicensing.

ADDENDUM: How to use this License for your documents

To use this License in a document you have written, 
include a copy of the License in the document and 
put the following copyright and license notices just 
after the title page: 

Copyright (C) YEAR YOUR NAME. Permission is 
granted to copy, distribute and/or modify this doc- 
ument under the terms of the GNU Free Documen- 
tation License, Version 1.3 or any later version pub- 
lished by the Free Software Foundation; with no 
Invariant Sections, no Front-Cover Texts, and no 
Back-Cover Texts. A copy of the license is included 
in the section entitled "GNU Free Documentation 
License".

If you have Invariant Sections, Front-Cover Texts 
and Back-Cover Texts, replace the "with . . . 
Texts." line with this: 

with the Invariant Sections being LIST THEIR TI- 
TLES, with the Front-Cover Texts being LIST, and 
with the Back-Cover Texts being LIST. 

If you have Invariant Sections without Cover Texts, 
or some other combination of the three, merge 
those two alternatives to suit the situation. 

If your document contains nontrivial examples of 
program code, we recommend releasing these exam- 
ples in parallel under your choice of free software 
license, such as the GNU General Public License, 
to permit their use in free software. 


9.3 GNU Lesser General Public License 


GNU LESSER GENERAL PUBLIC LICENSE 
Version 3, 29 June 2007 

Copyright © 2007 Free Software Foundation, Inc. 
<http://fsf.org/> 

Everyone is permitted to copy and distribute verba- 
tim copies of this license document, but changing 
it is not allowed. 

This version of the GNU Lesser General Public License
incorporates the terms and conditions of version
3 of the GNU General Public License, supplemented
by the additional permissions listed below.

0. Additional Definitions.

As used herein, “this License” refers to version 3 
of the GNU Lesser General Public License, and the 
“GNU GPL” refers to version 3 of the GNU General 
Public License. 

“The Library” refers to a covered work governed by 
this License, other than an Application or a Com- 
bined Work as defined below. 

An “Application” is any work that makes use of an 
interface provided by the Library, but which is not 
otherwise based on the Library. Defining a subclass 
of a class defined by the Library is deemed a mode 
of using an interface provided by the Library. 

A “Combined Work” is a work produced by com- 
bining or linking an Application with the Library. 
The particular version of the Library with which 
the Combined Work was made is also called the 
“Linked Version”. 

The “Minimal Corresponding Source” for a Com- 
bined Work means the Corresponding Source for 
the Combined Work, excluding any source code for 
portions of the Combined Work that, considered in 
isolation, are based on the Application, and not on 
the Linked Version. 


The “Corresponding Application Code” for a Com- 
bined Work means the object code and/or source 
code for the Application, including any data and 
utility programs needed for reproducing the Com- 
bined Work from the Application, but excluding the 
System Libraries of the Combined Work.

1. Exception to Section 3 of the GNU GPL.

You may convey a covered work under sections 3
and 4 of this License without being bound by section
3 of the GNU GPL.

2. Conveying Modified Versions.

If you modify a copy of the Library, and, in your 
modifications, a facility refers to a function or data 
to be supplied by an Application that uses the fa- 
cility (other than as an argument passed when the 
facility is invoked), then you may convey a copy of 
the modified version: 

* a) under this License, provided that you make a
good faith effort to ensure that, in the event an
Application does not supply the function or data, the
facility still operates, and performs whatever part
of its purpose remains meaningful, or

* b) under the GNU GPL, with none of the additional
permissions of this License applicable to that copy.

3. Object Code Incorporating Material from Li- 
brary Header Files. 

The object code form of an Application may incor- 
porate material from a header file that is part of 
the Library. You may convey such object code un- 
der terms of your choice, provided that, if the in- 
corporated material is not limited to numerical pa- 
rameters, data structure layouts and accessors, or 
small macros, inline functions and templates (ten 
or fewer lines in length), you do both of the follow- 
ing: 

* a) Give prominent notice with each copy of the 
object code that the Library is used in it and that 
the Library and its use are covered by this License. 

* b) Accompany the object code with a copy of the 
GNU GPL and this license document. 


4. Combined Works. 


You may convey a Combined Work under terms of 
your choice that, taken together, effectively do not 
restrict modification of the portions of the Library 
contained in the Combined Work and reverse en- 
gineering for debugging such modifications, if you 
also do each of the following: 

* a) Give prominent notice with each copy of the
Combined Work that the Library is used in it and
that the Library and its use are covered by this
License.

* b) Accompany the Combined Work with a copy of the
GNU GPL and this license document.

* c) For a Combined Work that displays copyright
notices during execution, include the copyright notice
for the Library among these notices, as well as a
reference directing the user to the copies of the GNU
GPL and this license document.

* d) Do one of the following:

  o 0) Convey the Minimal Corresponding Source under
  the terms of this License, and the Corresponding
  Application Code in a form suitable for, and under
  terms that permit, the user to recombine or relink
  the Application with a modified version of the
  Linked Version to produce a modified Combined Work,
  in the manner specified by section 6 of the GNU GPL
  for conveying Corresponding Source,

  o 1) Use a suitable shared library mechanism for
  linking with the Library. A suitable mechanism is
  one that (a) uses at run time a copy of the Library
  already present on the user’s computer system, and
  (b) will operate properly with a modified version
  of the Library that is interface-compatible with
  the Linked Version.

* e) Provide Installation Information, but only if
you would otherwise be required to provide such
information under section 6 of the GNU GPL, and only
to the extent that such information is necessary to
install and execute a modified version of the
Combined Work produced by recombining or relinking
the Application with a modified version of the Linked
Version. (If you use option 4d0, the Installation
Information must accompany the Minimal Corresponding
Source and Corresponding Application Code. If you
use option 4d1, you must provide the Installation
Information in the manner specified by section 6 of
the GNU GPL for conveying Corresponding Source.)


5. Combined Libraries. 

You may place library facilities that are a work 
based on the Library side by side in a single library 
together with other library facilities that are not 
Applications and are not covered by this License, 
and convey such a combined library under terms of 
your choice, if you do both of the following: 

* a) Accompany the combined library with a copy
of the same work based on the Library, uncombined
with any other library facilities, conveyed under
the terms of this License.

* b) Give prominent notice with the combined library
that part of it is a work based on the Library, and
explaining where to find the accompanying uncombined
form of the same work.

6. Revised Versions of the GNU Lesser General 
Public License. 

The Free Software Foundation may publish revised 
and/or new versions of the GNU Lesser General 
Public License from time to time. Such new ver- 
sions will be similar in spirit to the present version, 
but may differ in detail to address new problems or 
concerns. 

Each version is given a distinguishing version num- 
ber. If the Library as you received it specifies that 
a certain numbered version of the GNU Lesser Gen- 
eral Public License “or any later version” applies to 
it, you have the option of following the terms and 
conditions either of that published version or of any 
later version published by the Free Software Foun- 
dation. If the Library as you received it does not 
specify a version number of the GNU Lesser Gen- 
eral Public License, you may choose any version of 
the GNU Lesser General Public License ever pub- 
lished by the Free Software Foundation. 

If the Library as you received it specifies that a 
proxy can decide whether future versions of the 
GNU Lesser General Public License shall apply, 
that proxy’s public statement of acceptance of 
any version is permanent authorization for you to 
choose that version for the Library. 



